Go forth and replicate!

To make replication studies more useful, researchers must make more of them, funders must
encourage them and journals must publish them.


No scientist wants to be the first to try to replicate another’s
promising study: much better to know what happened when
others tried it. Long before replication or reproducibility
became major talking points, scientists had strategies to get the word
out. Gossip was one. Researchers would compare notes at conferences,
and a patchy network would be warned about whether a study was
worth building on. Or a vague comment might be buried in a related
publication. Tell-tale sentences would start “In our hands”, “It is unclear
why our results differed...” or “Interestingly, our results did not...”

What might seem obvious, a paper on attempts and outcomes,
was almost never an option. Many journals refused to consider replication
studies, and a lot of researchers had no desire to start a feud if their
results did not match. So scientists not in the know might waste time
exploring a blind alley or be wary about truly promising research.

Things are improving. Nowadays, researchers who want to tell the
scientific community about their replication studies have multiple ways
to do so. They can chronicle their attempts on a blog, post on a preprint
server or publish peer-reviewed work in those journals that do not
require novelty. Just this year, the online platform F1000 launched the
dedicated Preclinical Reproducibility and Robustness channel for refutations,
confirmations or more nuanced replication studies. Other titles,
including Scientific Data and the American Journal of Gastroenterology,
have openly solicited replication attempts and negative results. In 2013,
after controversial work on whether bioactive RNA molecules could
cross from the digestive tract to the bloodstream, Nature Biotechnology
declared itself “receptive to replication”, provided that such studies
illuminate crucial research questions (Nature Biotechnol. 31, 943; 2013).

The psychology community is a leader in this: Perspectives on Psychological
Science has begun publishing a new type of article, and pioneering
a new form of collaboration. It asks psychologists to nominate
an influential study for replication and to draw up a plan. The original
author is invited to offer suggestions on the protocol, multiple labs
volunteer to collect data, and results, whatever they may be, are
published as a registered replication report (RRR). So far, three have
been published, each with a perspective by the original authors.


GET IT OUT THERE 

Yet it would be inefficient to pursue such projects for more than a 
sliver of publications. Most replication attempts are not organized 
collaborations, but individual laboratories testing the next stage of 
their research. If those results were shared, science would benefit. 

Why doesn’t this happen more often? Because the replication 
ecosystem, such as it is, lacks visibility, value and conventions. 

When a researcher happens on an exciting paper, there is no easy way 
to learn about replication attempts. Replication studies are not auto- 
matically or consistently linked to original papers on journal websites, 
PubPeer or PubMed. When a replication attempt is mentioned in passing
in a broader study, there is no way to capture it. Journals cannot
be expected to curate all replication attempts of papers they publish,
although they should support technology that aggregates and dissemi- 
nates that information. And they should be open to publishing in-depth 
replication attempts for original papers. For example, Scientific Reports 
encourages critique by offering to waive its article-processing charge for 
a peer-reviewed refutation of an article published in the journal. 
Increased visibility would raise the value of a replication attempt,
but also increase the risk of retaliation against replicators. There is
little reward for taking that risk. A published replication currently
does little to raise the esteem of the replicator with hiring committees
or grant reviewers. This creates a chicken-and-egg problem: researchers
don’t want to conduct and publish rigorous replication studies because they
are not valued, and replication studies are not
valued because few are published. Commendably, funders such as the
Laura and John Arnold Foundation in the United States and the Neth- 
erlands Organisation for Scientific Research are explicitly supporting 
replication studies, and setting high expectations for publication. Scien- 
tists can help to ensure that such studies are valued by citing them and 
by discussing them on social media. 

Conventions around replication studies are in their infancy — even 
the vocabulary is inadequate. Editors who coordinate RRRs strive to 
avoid loaded labels such as ‘successful’ and ‘failed’ replications. The 
Reproducibility Initiative, a project to help labs coordinate inde- 
pendent replications of their own work, also shied away from similar 
pronouncements after its first study. A paper is a jumble of context, 
experiments, results, analysis and informed speculation. Outcomes 
can depend on apparently trivial differences in methods, such as how 
vigorously reagents are mixed, as one collaboration painstakingly 
discovered (W. C. Hines et al. Cell Rep. 6, 779-781; 2014). 

Neither are there conventions for interactions between replicators and 
the original authors. Some original authors have refused to share data 
or methodological details. In other cases, some replicators broadcast 
their attempts without first trying to resolve inconsistencies, a practice 
that leaves them more open to charges of incompetence. (Thankfully, 
both replicators and original authors are now backing away from name- 
calling.) As replication becomes more mainstream, we trust that the 
community will establish reasonable standards of conduct. 

To foster better behaviour, replication attempts must become more 
common. We urge researchers to open their file drawers. We urge 
authors to cooperate with reasonable requests for primary data, to 
assume good intent and to write papers — and keep records — assum- 
ing that others will want to replicate their work. We urge funders and 
publishers to support tools that help researchers to thread the literature 
together. We welcome, and will be glad to help disseminate, results that 
explore the validity of key publications, including our own.
WORLD VIEW

China’s soil plan needs strong support

The government must accompany its action plan on soil quality with effective
laws and remediation measures, says Hong Yang.

The impact of the booming Chinese economy on the quality
of the nation’s air and water has garnered a lot of attention
recently. Now, focus is turning to another polluted realm: the
very ground beneath China’s feet.

Chinese researchers are still digesting the implications of a national
action plan on soil quality released in May. But two problems are
already clear, neither of them unique to China. The first is that remediation
of polluted soil is very expensive. The second is that binding
legislation must be in place to make sure the necessary money is spent.

The consequences of failing to act properly were graphically demonstrated
in April. Nearly 500 teenagers at a school constructed near
three former chemical plants in Changzhou, east China, were found
to have developed health problems ranging from nosebleeds to lymphoma
and leukaemia. Preliminary investigations found the soils and
groundwater to be laced with heavy metals and
organic pollutants. In particular, chlorobenzene
was found at concentrations nearly 10,000 times
the safe level.

Ironically, problems like this are the consequence
of concerns about pollution. Dirty
factories and industrial plants moved from urban
areas to the suburbs and rural locations. Their
former sites were demolished, and schools and
houses were built in their place.

There are thousands of brownfield sites across
China, and many are heavily polluted. Serious pollution
risks were confirmed by the authorities in
China’s soil-pollution survey, once guarded as a
national secret but publicly released in April 2014. It claimed that 16%
of China’s soils exceed national standards for pollution with heavy metals
and pesticides and, remarkably, that 34.9% of brownfields exceeded
national pollution standards.

But a lack of transparent detailed information on locations and pollution
levels at specific sites hinders public awareness and efforts to tackle
the problem. Unlike the often visible pollution of air or water, soil contamination
is difficult to detect without professional equipment. Many
brownfield sites have already been used for businesses or housing with
insufficient remediation. And until something changes, seriously polluted
sites will continue to imperil the environment.

The nationwide action plan calls for new laws to monitor, prevent
and remediate the soil pollution. It is very ambitious, aiming to curb
new soil pollution by 2020 and incrementally improve soil quality
across the country by the middle of this century.

To implement the plan fully would be extremely costly, at more
than 7 trillion yuan (US$1.06 trillion) according to one estimate. To
put that into perspective, the Chinese government’s investment in soil
remediation this year is around 9 billion yuan.

How can the gap be closed?

The common ‘polluter pays’ principle applies in China, but soil
pollution can take a long time to emerge and be detected. That makes
it difficult to track and locate responsible parties, and these historical 
polluters often have no means to pay. Other funding mechanisms 
have to be considered, for example from the World Bank and Global 
Environment Facility. Land developers also need to meet more of 
the clean-up cost before they are granted the rights to use and sell 
brownfield sites. Even then, the costs will probably be too high to 
remediate properly, so China must consider a range of other funding 
instruments, from environmental taxes and clean-up subsidies to 
loan guarantees and insurance. 

The new plan does not specify which technologies should be used 
for soil remediation, and this is a problem. At present, many pro- 
jects try to shorten the treatment period and reduce costs by remov- 
ing just the polluted topsoil, and so the contamination remains in 
deeper ground and water. The removed topsoil 
is landfilled, which just moves the problem 
somewhere else, or incinerated, which releases 
contaminants into the air. Soil-washing is an 
option, but it generates secondary wastes that 
require additional decontamination. Improper
decontamination measures can even make a site
worse by releasing buried pollutants. This seems 
to be what happened at the Changzhou school. 

Low-cost and effective new technologies are 
urgently needed. Plant-based bioremediation 
approaches have been developed to treat arse- 
nic and cadmium pollution in China. More 
research on other biological approaches (use of 
earthworms, nematodes and phytoremediation) should be launched. 

In common with previous environmental efforts in China, suc- 
cess of the soil action plan will depend on the strength of the law 
that supports it. Legislation in this area is outdated, and a replace- 
ment has been promised by 2020. It should include firm criteria 
and measures to determine the effectiveness of remediation. (At 
present, some areas in China use guidelines prepared by the US 
Environmental Protection Agency, which often do not properly 
apply to local circumstances). 

China is not alone in facing risks caused by soil pollution. Simi- 
lar major problems have occurred in other developing countries 
undergoing rapid urbanization, including India and Brazil. As a
world laboratory for new technology to decontaminate brown- 
fields, China’s approach to managing the problem could benefit 
other countries. By 2050, another 2.5 billion people are expected 
to be living in cities. The brownfields must be treated effectively to 
give these new urban residents clean land.


Hong Yang is a researcher at the Norwegian Institute of Bioeconomy 
Research (NIBIO) and the University of Oslo, Norway. 
e-mail: hongyanghy@gmail.com 
RESEARCH HIGHLIGHTS
Selections from the scientific literature


Longing for home 
undid cave bears 


When looking for a spot to 
hibernate, ancient cave bears 
stuck with family. 

Cave bears (Ursus spelaeus) 
went extinct 24,000 years 
ago, whereas the related 
brown bears (Ursus arctos) 
still thrive. To explain these 
differing fates, a team led 
by Gloria Fortes and Axel 
Barlow at the University of 
Potsdam, Germany, obtained 
mitochondrial genomes of 
31 cave bears and 15 ancient 
brown bears from caves in 
Spain. Cave bears from the 
same caves tended to share 
mitochondrial DNA, which 
is inherited maternally, 
whereas brown-bear caves 
contained multiple maternal 
lines. 

These patterns suggest 
that cave bears nearly always 
hibernated in their native 
caves, which may have 
contributed to their demise 
in the face of competition 
from other bears, the 
scientists say. 

Mol. Ecol. http://doi.org/bpgr 
(2016) 


ECONOMICS

Satellites map
world poverty


Computer scientists have
used satellite imagery and
machine-learning techniques
to make detailed maps of
regions where poverty is
common.

Neal Jean and his colleagues
at Stanford University in
California focused on Nigeria,
Tanzania, Uganda, Malawi
and Rwanda and combined
various data sets, including
daytime images that identify
features such as paved roads
and metal roofs, to estimate
local household consumption
and income. When identifying
areas where incomes are below
the international poverty
line of US$1.90 per person
per day, the team’s algorithm
outperforms night-light
maps (an alternative but
limited indicator of economic
activity). It also probes
hard-to-reach areas, including
regions not accessed by
household surveys (such as
those conducted by the World
Bank), which are costly and
infrequently conducted.

The method could prove
useful for targeting social
programmes and determining
when and where they fail.
Science 353, 790-794 (2016)


CELL BIOLOGY

See-through rodents


An imaging technique lets scientists peer through the skin of
a whole mouse or rat to examine its organs after death.

Ali Ertürk of the Ludwig Maximilians University of
Munich in Germany and his colleagues created a technique
called ultimate DISCO (uDISCO), which removes
pigments and lipids from the tissues of dead animals using
an organic solvent. This leaves the organs and skin intact
but transparent, while preserving genetically encoded
fluorescent proteins. The method revealed the nervous
system of a mouse in stark detail.

uDISCO also shrinks bodies by up to 65%, making it
possible to image whole animals using light-sheet microscopy,
which excels at imaging smaller samples.
Nature Methods http://dx.doi.org/10.1038/nmeth.3964 (2016)


GENOMICS
Medicine less 
precise for some 


A lack of ethnic diversity 
in people whose genomes 
have been sequenced is 
complicating precision 
medicine for people with 
non-European ancestry. 
David Goldstein of 
Columbia University in 
New York City and Slavé 
Petrovski of the Royal 
Melbourne Hospital in 
Australia examined data 
from the Exome Aggregation 
Consortium (ExAC), 
which contains sequences 
from 60,252 people, 60.9% 
of whom have European 
ancestry. When they 
compared genetic variants 
from a cohort of 5,094 people 
with variants found in the 
ExAC and other data sets, 
the comparisons yielded a 
shorter list of potentially 
causative variants in people 
with European ancestry (6.6, 
on average) than in people 
with non-European ancestry 
(9.9-12.7 candidate variants, 
depending on ethnicity). 
Precision medicine is 
much more precise for people 
with European than non- 
European ancestry owing 
to undersampling of non- 
European populations, the 
authors write. 
Genome Biol. http://doi.org/bphp 
(2016) 


PARTICLE PHYSICS 


Neutrino search 
closes in 


Scientists are getting closer 
to discovering whether 
neutrinos and antineutrinos 
are in fact the same particles 
— known as Majorana 
neutrinos. 

The theory, proposed 
by Italian physicist Ettore 
Majorana in the 1930s, could 
explain why neutrinos have
mass and why the Universe 
contains more matter than 
antimatter. Azusa Gando at 
Tohoku University in Sendai, 
Japan, and her colleagues 
in the KamLAND-Zen 
Collaboration carried out the 
most sensitive search so far for 
radioactive decay indicative of 
Majorana neutrinos, using an 
underground detection facility 
containing a huge balloon 
filled with purified xenon. 

The team’s results, although 
negative, constrain the 
upper limit of the mass 
of Majorana neutrinos to 
61-165 millielectronvolts. 
However, the detector’s 
sensitivity must be pushed 
even further to prove 
Majorana’s theory, the 
researchers say. 
Phys. Rev. Lett. 117, 082503
(2016) 


MATERIALS SCIENCE 


Bulk production of 
mother-of-pearl 


Artificial mother-of-pearl 
can be made by mimicking 
the natural process of 
mineralization. 

Mother-of-pearl, or nacre, 
is remarkably strong yet 
biodegradable. However, its 
complex layered structure, 
in which mineral plates 
form in an organic scaffold, 
makes it difficult to recreate 
in bulk. Shu-Hong Yu at 
the University of Science 
and Technology of China in 
Hefei and his colleagues built 
their own matrix by growing 
sheets of ice, which squeezed 
a solution of the biopolymer 
chitosan into solid layers. 
They then pumped this 
scaffold with materials to 
grow calcium carbonate, and 
pressed the stack to form 
synthetic nacre. 

The synthetic version has 
similar mechanical properties 
to its natural counterpart and 
takes just two weeks to grow. 
This method could be used to 
produce materials for use in 
the aerospace industry or as 
armour, say the authors. 
Science http://doi.org/bpk2 
(2016) 


ASTRONOMY 


Dark-matter 
evidence weakens 


A survey of X-ray light from 
galaxy clusters has found 
no evidence of dark matter 
decaying, in the latest in a 
series of contradictory results. 
In 2014, two separate 
teams found an unexpected 
bump in the energy 
spectra of dozens of galaxy 
clusters. Emissions at 
3.55 kiloelectronvolts (keV) 
were seen as a possible sign of 
the decay of ‘sterile’ neutrinos 
with a mass of 7.1 keV. 
Physicists have hypothesized 
that these heavier cousins 
of the three known types 
of neutrino are possible 
components of dark matter. 
Now astronomer Florian 
Hofmann at the Max Planck 
Institute for Extraterrestrial 
Physics in Garching, 
Germany, and his team have 
analysed publicly available 
data from the Chandra X-ray 
Observatory concerning 
33 galaxy clusters (11 of 
which were not included in 
the original studies). Their 
search found no evidence 
of an anomalous bump at 
around 3.55 keV. 
Astron. Astrophys. 592, A112 
(2016) 


Early humans were 
picky dressers 


Ancient clothing is 
rarely preserved, but two 
independent teams have 
discovered what early humans 
wore to cope with the cold 
European weather. 

Mark Collard at Simon 
Fraser University in Burnaby,
Canada, and his colleagues
compared the animals that 
modern indigenous groups 
used to make cold-weather 
clothing with the bone types 
found at early human and 
Neanderthal sites. Remains 
from animals with fur, such as 
foxes and rabbits, were more 
common at early-human 
sites, whereas bones from 
deer, bovids and several other 
animals were found at both 
types of site equally. This 
suggests that early humans 
used fur to sew specialized 
cold-weather apparel, but 
that Neanderthals relied on 
simpler animal-skin capes, the 
authors say. 

In a separate paper, Niall 
O’Sullivan at the Institute for 
Mummies and the Iceman in 
Bolzano, Italy, and his team 
sequenced mitochondrial 
DNA from garments worn by 
Ötzi, the 5,300-year-old ice
mummy. His coat, leggings 
(pictured left) and loincloth 
were made from the skins of 
domestic cattle, sheep and 
goats, whereas his hat and 
quiver (pictured right) used 
brown-bear fur and roe-deer 
skin. 

J. Anthropol. Archaeol. http://doi.org/bn82 (2016);
Sci. Rep. 6, 31279 (2016)


New dolphin 
species found 


A fossilized dolphin skull in 
the Smithsonian collection has 
been identified as an entirely 
new species 65 years after it 
was dug out of the ground in 
Alaska. 

Alexandra Boersma and 
Nicholas Pyenson of the 
Smithsonian Institution's 
National Museum of Natural 
History in Washington DC 
identified a 23-centimetre- 
long skull (pictured) as a new 
genus and species in a family 
called the Allodelphinidae. 
The extinct animal is closely 
related to today’s South Asian 
river dolphin (Platanista 
gangetica). The fossil dates 
from around 25 million 
years ago, a few million years 
after cetaceans diverged into
toothed whales and filter-feeding
baleen whales, and is
one of the earliest examples 
found of the former group. 

It is also the most northern 
specimen yet discovered of 
the Allodelphinidae, and the 
researchers have dubbed the 
animal Arktocara yakataga — 
the face of the north. 

PeerJ 4, e2321 (2016) 


Stem cells predict 
drug safety 


Heart muscle cells derived 
from individual patients’ stem 
cells could be used to test the 
safety of a drug before it’s 
administered — a boon for 
precision medicine. 

Elena Matsa and Joseph 
Wu of Stanford University 
in California and their 
colleagues made heart 
muscle cells from induced 
pluripotent stem cells derived 
from seven people. They then 
exposed the muscle cells to 
one of two drugs that have 
been linked to heart problems 
in some people: rosiglitazone 
and tacrolimus. The results 
showed differences in how the 
cells responded to the drugs: 
one cell line, for example, 
showed signs of increased 
stress after treatment that 
were not observed in cells 
from other patients. 

The approach could 
one day be used to tailor 
treatment regimens to 
individuals and to test drug 
candidates for potential 
toxicity before they enter 
clinical trials. 
Cell Stem Cell http://dx.doi.org/10.1016/j.stem.2016.07.006
(2016)
SEVEN DAYS
The news in brief


Villagers vote to leave for good

People in the Alaskan village of Shishmaref
voted on 16 August to abandon their tiny island
northeast of the Bering Strait and move onto
the mainland because of erosion due to global
warming and rising sea level. The Inuit village,
containing around 600 people, has already
been affected by erosion and flooding, which is
expected only to worsen in coming decades. The
unofficial vote was 89-78 in favour of moving,
but there are as yet no funds to pay for relocation.


HEALTH


Poor diet plan
British health campaigners
have pulled to pieces a UK
government plan to tackle
childhood obesity. The plan
includes a levy on high-sugar
soft drinks. But the
final version, unveiled on
18 August, is significantly
weaker than some researchers
had hoped. Among the
critics is the British Medical
Association, which said that
the government had “rowed
back” on promises to crack
down on the problem and
instead produced a “weak
plan” with “pointless”
voluntary targets.


Feverish campaign
A logistically challenging
emergency vaccination
campaign, launched last week,
aims to stop the spread of
deadly yellow fever in Angola
and the Democratic Republic
of the Congo. Working with
the countries’ health ministries
and 56 global partners, the
World Health Organization is
coordinating the vaccination
of 14 million people in
more than 8,000 locations,
including urban areas and
hard-to-reach border regions.
Since December, more than
400 people in the region have
died in the worst outbreak of
the mosquito-borne disease
for 30 years.


Hidden Zika risk
A medical report from Brazil
confirms that Zika virus can
be transmitted through a
blood transfusion. Two people
received platelet transfusions
in January from a blood donor
who showed no symptoms
at the time of donation, but
who was later found to have
been infected with Zika virus.
The two recipients harboured
Zika virus RNA that was
genetically related to the virus
found in the donor, although
they showed no symptoms of
infection. The findings add
to evidence showing another
mode of transmission for Zika,
which has been shown to be
passed from mother to child
in utero, and between sexual
partners, say the doctors who
reported the case on 17 August
(I. J. F. Motta et al. N. Engl.
J. Med. http://doi.org/bppb;
2016).


NUMBER CRUNCH

23%

The share of ‘bee-friendly’
garden plants purchased
in the United States that
contain neonicotinoid
pesticides, which are
highly toxic to bees. Until
two years ago, more than
half of garden plants sold
by large US retailers were
pretreated with the bee-toxic
insecticides.

Source: Friends of the Earth


RESEARCH

Free space for all
NASA announced on
16 August that it is granting
free public access to any
published research funded by
the agency. Research data and
peer-reviewed publications
by NASA-funded scientists
will be available for download
and reading on the agency’s
PubSpace portal within one
year of publication. PubSpace
was created in response to a
2013 request by the White
House Office of Science and
Technology Policy. Other
US agencies, including the
National Institutes of Health
and the Food and Drug
Administration, are making
their research available
through the same portal.


Patent thriller 

Fresh accusations have
rekindled the battle over
who invented the potentially
lucrative CRISPR-Cas9
gene-editing technology.
In an e-mail released by the
University of California
(UC) on 15 August as part
of a pending patent case,
Shuailiang Lin, a former
student at the Broad Institute
in Cambridge, Massachusetts,
claims that, contrary to the
Broad’s claims, the lab built
its gene-editing technique on
the back of discoveries made
at UC Berkeley. The Broad
denies Lin’s claims and notes
that he used the same e-mail to
apply for a job at UC Berkeley.
The letter was first reported by
the MIT Technology Review.


SOURCE: GWEC


POLICY 


Greener lorries 


The US government 
announced stricter fuel- 
economy standards for 
lorries, buses and vans on
16 August. Although heavy-duty
vehicles make up only
about 5% of traffic in the 
United States, they account 
for more than 20% of fuel 
consumption in the transport 
sector, and contribute a 
similar proportion of carbon 
dioxide emissions. The 
standards for new vehicles 
will be introduced gradually, 
becoming tighter each year 
over the next decade. 


EVENTS 


Maiden voyage 

The world’s largest aircraft
made its maiden flight in the
United Kingdom on 17 August.
The 92-metre-long Airlander
10 (pictured) incorporates
lighter-than-air technology,
combining characteristics of
an airship and an aeroplane.
It took off from Cardington
Airfield in Bedfordshire and
performed a circuit of the
area before landing safely 20
minutes later. Its developer,
Hybrid Air Vehicles in Bedford,
says that the hybrid can travel at
a cruise speed of 148 kilometres
per hour and stay airborne for
up to 5 days. The flight marked
the start of 200 hours of flight
testing. Airlander 10 is intended
to be used for surveillance,
communication, aid delivery
and even passenger travel.


Record hot July 


July this year was the warmest
month since systematic global
temperature records began in
1880. It was 0.1 °C warmer
than the previous hottest
Julys, in 2015, 2011 and 2009,
according to a monthly analysis
of global temperatures by
scientists at NASA’s Goddard
Institute for Space Studies in
New York City. July was the
tenth record-setting warm
month in a row, so 2016 looks
set to end up as the warmest
year in 136 years of modern
record-keeping.


Mind the data 


Psychologists can now share
their data and early results
with colleagues before formal
publication. On 15 August,
a comparative study of
personality traits in 8,600 US
students was the first paper
to be deposited in PsyArXiv,
a new preprint server for the
psychological sciences
(K. S. Corker and B. Donnellan
Preprint at PsyArXiv
http://osf.io/xeg7y; 2016). Following
the example of the successful
physics server arXiv, similar
online repositories were
launched earlier this year
for the social sciences and
for engineering. A preprint
service for chemists will be up
and running soon.


PEOPLE


Berkeley head quits
Nicholas Dirks has resigned
as chancellor of the University
of California, Berkeley. He
did not give a specific reason
for his decision, announced
on 16 August, but he had
been criticized for how he
managed the university budget
and handled allegations of
sexual harassment against
faculty members and staff,
including complaints
filed last year against the
astronomer Geoffrey Marcy.
Marcy stepped down from his
position in October 2015, after
the complaints came to light.


TREND WATCH


The UK government’s backing
of a planned 1.8-gigawatt wind
farm off the Yorkshire coast
cements Britain’s position as
the world’s leading producer of
offshore wind power. More than
90% of global capacity is installed
off the coasts of 11 European
countries. Most of the remaining
10% is installed off China. In
2015, offshore wind accounted
for 24% of total wind-power
installations in the European
Union. Globally, offshore wind
represents only about 3% of
installed wind capacity.


EUROPE’S PUSH FOR OFFSHORE WIND
Europe leads the world in installed offshore wind power, with the
United Kingdom and Germany contributing the most.


Russian gambit 


On 19 August, Russian
President Vladimir Putin
appointed church historian
Olga Vasilyeva as the country’s
science and education
minister. Vasilyeva succeeds
Dmitry Livanov, who will
become presidential envoy
for trade and economic
relations with Ukraine.
During his four-year term as
minister, Livanov oversaw
a radical overhaul of the
Russian Academy of Sciences,
Russia’s main basic-research
organization. Vladimir
Ivanov, a vice-president of the
academy, welcomed Livanov’s
replacement. Vasilyeva,
formerly in charge of public
education in religion and
history in the presidential
administration, told Russian
news agency Interfax that
religion will not interfere with
her work as education and
science minister. See
http://go.nature.com/2c1zbq1 for
more.


Smallpox fighter 
Donald Ainslee Henderson, 
head of the successful 
campaign to wipe out 
smallpox, died on 19
August, aged 87. Henderson
headed the World Health 
Organization's global smallpox 
eradication campaign between 
1966 and 1977. The experience 
caused him to question the 
feasibility of other disease- 
eradication efforts. Smallpox 
remains the only human 
disease ever to be wiped out, 
although decades-long efforts 
to eliminate polio and Guinea- 
worm disease may now be 
nearing completion. 
Nearby star hosts planet


Earth-sized world orbiting Proxima Centauri could harbour water — and life. 


BY ALEXANDRA WITZE 


Proxima Centauri, the star closest to the
Sun, has an Earth-sized planet orbiting
it at the right distance for liquid water
to exist. The discovery, reported this week 
in Nature’, fulfils a longstanding dream of 
science-fiction writers — a potentially habit- 
able world that is close enough for humans to 
send their first interstellar spacecraft to. 

“The search for life starts now,” says 
Guillem Anglada-Escudé, an astronomer at 
Queen Mary University of London and leader 
of the team that made the discovery. 

Humanity’s first chance to explore this 
nearby world may come from the recently 
announced Breakthrough Starshot initiative, 
which plans to build fleets of tiny laser- 
propelled interstellar probes in the coming 
decades. Travelling at 20% of the speed of light,
they would take about 20 years to cover the
1.3 parsecs from Earth to Proxima Centauri. 
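
The travel time quoted above can be checked with simple arithmetic. The sketch below is only an illustration: the function name and the parsec-to-light-year factor are assumptions introduced here, and the calculation ignores acceleration, deceleration and relativistic effects.

```python
# Approximate conversion factor (an assumption of this sketch, not from the article).
LIGHT_YEARS_PER_PARSEC = 3.2616


def travel_time_years(distance_parsecs: float, speed_fraction_of_c: float) -> float:
    """Years needed to cover a distance at a constant fraction of light speed.

    A probe moving at a fraction f of light speed covers f light years per year,
    so the time in years is the distance in light years divided by f.
    """
    if speed_fraction_of_c <= 0:
        raise ValueError("speed must be a positive fraction of light speed")
    distance_light_years = distance_parsecs * LIGHT_YEARS_PER_PARSEC
    return distance_light_years / speed_fraction_of_c


# Figures from the text: 1.3 parsecs at 20% of the speed of light.
years = travel_time_years(1.3, 0.20)
print(f"Approximate travel time: {years:.1f} years")
```

The result is a little over 21 years, consistent with the article's "about 20 years".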

Proxima’s planet is at least 1.3 times the mass 
of Earth. The planet orbits its red-dwarf star 
— much smaller and dimmer than the Sun — 
every 11.2 days. “If you tried to pick the type of
planet you’d most want around the type of star
you’d most want, it would be this,” says David
Kipping, an astronomer at Columbia University
in New York City. “It’s thrilling.”


GRAVITATIONAL HINTS 

Earlier studies had hinted at the existence of 
a planet around Proxima. Starting in 2000, 
a spectrograph at the European Southern 
Observatory (ESO) in Chile looked for shifts 
in starlight caused by the gravitational tug of an 
orbiting planet. The measurements suggested 
that something was happening to the star every 
11.2 days. But astronomers could not tell
whether the signal was caused by an orbiting
planet or another type of activity, such as 
stellar flares. 

In January 2016, Anglada-Escudé and 
his colleagues launched a campaign to nail down 
the suspected Proxima planet. ESO granted 
their request to observe using a second planet- 
hunting instrument, on a different telescope, 
for 20 minutes almost every night between 
19 January and 31 March. “As soon as we had 
10 nights it was obvious,” Anglada-Escudé says.

The team dubbed the work the ‘pale red
dot’ campaign, after the famous ‘pale blue dot’
photograph taken of Earth by the Voyager 1 
spacecraft in 1990. Because Proxima is a red- 
dwarf star, the planet would appear reddish or 
orangeish, perhaps bathed in light similar to 
the warm evening tints of Earth. 

Although the planet orbits at a distance that 
would permit liquid water, other factors 
might render it unlivable. It might be
tidally locked — meaning that the same 
hemisphere always faces the star, which 
scorches one side of the planet while the 
other remains cool. The active star might 
occasionally zap the planet with destruc- 
tive X-ray flares. And it’s unclear whether 
the planet has a protective, life-friendly 
atmosphere. 

Proxima belongs to the triple-star system 
Alpha Centauri. In 2012, a Nature paper 
reported that an Earth-mass planet orbited 
another member of that stellar trio, Alpha 
Centauri B’. That result has now mostly 
been dismissed**, but exoplanet specialists 
say that the Proxima claim is more likely to 
hold up. “People call me Mr Sceptical, and
I think this result is more robust,” says Artie
Hatzes, an astronomer at the Thuringian 
State Observatory in Tautenburg, Germany. 

This time, the combination of new obser- 
vations and older measurements dating 
back to 2000 increases confidence in the 
finding, Anglada-Escudé’s team argues. 
“It’s stayed there robustly in phase and 
amplitude over a very long time,” says team 
member Michael Endl, an astronomer at 
the University of Texas at Austin. “That’s a
telltale sign of a planet.” The data also contain
hints that a second planet might exist,
orbiting Proxima somewhere between 
every 100 and 400 days. 

The researchers now hope to learn 
whether the Proxima planet’s pass across 
the face of its star can be seen from Earth. 
Such a ‘transit’ could reveal whether the 
planet has an atmosphere. A team led by 
Kipping has been independently look- 
ing for transits around Proxima, and is 
frantically crunching its data in search of 
a signal. 

The discovery of the Proxima planet 
comes at a time of growing scientific inter- 
est in small planets around dwarf stars, 
says Steinn Sigurdsson, an astrophysicist at 
Pennsylvania State University in University
Park. NASA’s Kepler space telescope
has shown that rocky planets are common 
around such stars, which themselves are the 
most common type of star in the Galaxy. 
“This is a total vindication of that strategy,” 
he says. 

One day, the Proxima planet might be 
seen as the start of a new stage in planetary 
research. “It gives us the target and focus 
to build the next generation of telescopes 
and one day maybe even get to visit,” 
says Kipping. “It’s exactly what we need to 
take exoplanetary science to the next 
level.” SEE NEWS & VIEWS P.408
INTELLECTUAL PROPERTY


Personalized 
medicine takes hit 


US Supreme Court decisions seem to drive patent rejections. 


BY HEIDI LEDFORD 


Rejections of US patents in categories
related to personalized medicine have
spiked after Supreme Court decisions 
tightened the rules for such claims, an analysis 
of more than 39,000 patent applications reveals. 
The data, presented on 11 August at the 
Intellectual Property Scholars Conference in 
Stanford, California, address patent applica- 
tions in eight categories that commonly include 
personalized-medicine patents. They show 
that following a key Supreme Court decision 
in 2012, the US Patent and Trademark Office 
(USPTO) was nearly four times more likely 
to deem subjects of such applications
unpatentable — and applicants were less
than half as likely to overcome those
rejections.

“The change in office actions was
absolutely striking,”

says Nicholson Price, who studies intellectual 
property at the University of Michigan Law 
School in Ann Arbor. “The data are very clear 
that the patent office has changed its
behaviour.”

Over the past decade, the Supreme Court has 
used a series of patent cases to clarify what the 
USPTO should consider patentable. Natural 
phenomena and abstract ideas, for example, 
are not patentable, according to section 101 of 
the US patent code. The court has attempted to 
distinguish between these categories and true 
inventions. 

Two of those Supreme Court cases touched 
directly on the biomedical industry. In 2012, 
the Mayo Collaborative Services v. Prometheus 
Laboratories, Inc. decision struck down two 
patents on medical diagnostics, and in the 2013 
Association for Molecular Pathology v. Myriad 
Genetics ruling, the court threw out patents 
on gene sequences used to assess cancer risk. 
In the wake of those decisions, many lawyers 
predicted that patents on inventions that are 
important to personalized medicine — par- 
ticularly, diagnostic tests that could match indi- 
viduals to a particular therapy — would be hard 
to come by, potentially driving away investors. 

Legal scholar Bernard Chao of the University 
of Denver in Colorado decided to find out
just how big the impact has been. Chao sifted 
through around 85,000 records of USPTO 
actions taken on more than 39,000 patent 
applications, and sorted out those that had been 
rejected for not meeting the requirements of 
section 101. 

He found that, last year, 22.5% of those patent- 
office actions were rejections because of 
section 101, compared with only 5.5% in 2011, 
the year before the Mayo decision. Applicants 
were also less likely to overcome those rejec- 
tions in the wake of the Mayo decision: before 
Mayo, 70.7% of the section 101 rejections 
were successfully overcome. After Mayo, that 
proportion dropped to 29.7%. 
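
The size of these shifts is easier to see as ratios. The following sketch simply divides the percentages reported above; the helper name `ratio` and the variable names are illustrative assumptions, not part of Chao's analysis.

```python
def ratio(after: float, before: float) -> float:
    """Return how many times larger (or smaller) `after` is than `before`."""
    if before == 0:
        raise ZeroDivisionError("the 'before' value must be non-zero")
    return after / before


# Share of patent-office actions that were section 101 rejections (percent).
rejection_share_before = 5.5   # 2011, the year before the Mayo decision
rejection_share_after = 22.5   # the most recent full year in the analysis

# Share of section 101 rejections that applicants overcame (percent).
overcome_before = 70.7
overcome_after = 29.7

rejection_ratio = ratio(rejection_share_after, rejection_share_before)
overcome_ratio = ratio(overcome_after, overcome_before)

print(f"Section 101 rejections rose by a factor of about {rejection_ratio:.1f}")
print(f"Success in overcoming them fell to about {overcome_ratio:.0%} of its earlier level")
```

On these figures, rejections rose roughly fourfold and the rate of overcoming them fell to well under half of its earlier level, in line with the summary given earlier in the article.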

But Chao notes that there are caveats to his 
analysis: the categories he examined omit some 
personalized-medicine patents and contain 
other kinds of patents as well. In the future, he 
hopes to take a closer look at individual patent 
applications, and to learn more about whether 
certain applications are more likely to get 
through than others. 

Those analyses will be key to finding out 
how patent applicants are adapting to the new 
requirements, says Price. “Patent attorneys are 
clever,” he says, and may have learnt how to
construct their patents to avoid conflict with 
the recent decisions. 

Others have documented a clear effect of the 
Supreme Court’s patent decisions on software 
patent applications. But some have cheered that 
change, Chao adds. Software patents are con- 
troversial, and some scholars have argued that 
such patents cause more harm to the industry 
than help it. Personalized-medicine patents, 
however, tend to get more support: “Person- 
alized medicine is probably the poster child 
of what we think should be incentivized by 
patents.” 

Ultimately, it will be difficult to unravel 
what impact the patent decline is having on 
the personalized-medicine industry, cautions 
Arti Rai, a legal scholar at Duke University in 
Durham, North Carolina. The sector is facing 
challenges from several sides: the US Food and 
Drug Administration has proposed tougher 
regulations, and insurance companies have 
been reluctant to pay for new diagnostic tests. 

“Diagnostics start-ups are not in a good 
space right now, that’s clear,” Rai says. “But how
much of that is due to Mayo is less clear.”
The Large Hadron Collider has not yet found phenomena outside the standard model of particle physics.


Who will build the 


next LHC? 


Uncertainty in particle physics complicates collider plans. 


BY ELIZABETH GIBNEY 


It was a triumph for particle physics — and
many were keen for a piece of the action.
The discovery of the Higgs boson in 2012
using the world’s largest particle accelerator,
the Large Hadron Collider (LHC), prompted
a pitch from Japanese scientists to host its
successor. The machine would build on the
LHC’s success by measuring the properties of
the Higgs boson and other known, or
soon-to-be-discovered, particles in exquisite detail.
But the next steps for particle physics now
seem less certain, as discussions at the
International Conference on High Energy Physics
(ICHEP) in Chicago on 8 August suggest. Much
hinges on whether the LHC unearths phenomena
that fall outside the standard model of
particle physics — something that it has not yet
done but on which physicists are still counting
— and whether China’s plans to build an LHC
successor move forward.

When Japanese scientists proposed host- 
ing the International Linear Collider (ILC), a 
group of international scientists had already 
drafted its design. It would collide electrons 
and positrons along a 31-kilometre-long track, 
in contrast to the LHC, which collides protons 
in a 27-kilometre-circumference circular track
at Europe's particle-physics laboratory, CERN, 
near Geneva, Switzerland. 


CLEANER COLLISIONS 
Because protons are composite particles made 
of quarks, collisions create a mess of debris. 
The particles that would be used in the ILC, 
by contrast, are fundamental and so provide 
cleaner collisions more suited to precision 
measurements, which could reveal deviations 
from expected behaviour that point to physics 
beyond the standard model. 

For physicists, the opportunity to carry out 
detailed studies of the Higgs boson and the
heaviest, ‘top’ quark, the second most-recently 
discovered particle, is reason enough to build 
the facility. Japan’s Ministry of Education, 
Culture, Sports, Science and Technology 
(MEXT) was expected to make a call on whether 
to host the project — which could begin experi- 
ments around 2030 — in 2016. But the panel 
advising MEXT indicated last year that oppor- 
tunities to study the Higgs boson and the top 
quark would not on their own justify building 
the ILC, and that it would wait until the end of 
the LHC’s first maximum-energy run — scheduled
for 2018 — before making a decision.

That means the panel is not yet convinced 
by the argument that the ILC should be built 
irrespective of what the LHC finds, says 
Masanori Yamauchi, director-general of Japan’s
High Energy Accelerator Research Organization
(KEK) in Tsukuba, who sat on an ICHEP
panel at a session on future facilities. “That’s the
statement hidden under their statement.”

If the LHC discovers new phenomena, these
would be further fodder for ILC research 
— and would strengthen the case for building 
the high-precision machine. 

US physicists have long backed the con- 
struction of a linear collider. And Yamauchi 
says that a joint MEXT and US Department 
of Energy group is discussing ways to reduce 
the ILC’s costs, which are now estimated at 
US$10 billion. A reduction of around 15% is 
feasible — but Japan will need funding com- 
mitments from other countries before it for- 
mally agrees to host, he added. 


CHINESE COMPETITOR 
Snapping at Japan's heels is a Chinese team. In 
the months after the Higgs discovery, physicists 
led by Wang Yifang, director of the Institute of 
High Energy Physics in Beijing, floated a plan to 
host a collider in the 2030s, also partially funded
by the international community and focused 
on precision measurements of the Higgs and 
other particles. Circular rather than linear, 
this 50–100-kilometre-long electron–positron
smasher would not reach the energies of the
ILC. But it would require the creation of a tunnel
that could allow a proton–proton collider
— similar to the LHC, but much bigger — to be 
built at a hugely reduced cost. 

Wang and his team this year secured 
around 35 million yuan (US$5 million) in 
funding from China’s Ministry of Science
and Technology to continue research and
development for the project, Wang told the 
ICHEP session. Last month, China’s National 
Development and Reform Commission turned 
down a further request from the team for 
800 million yuan, but other funding routes 
remain open, Wang said. The team now plans 
to focus on raising international interest in 
the project. 

By affirming worldwide interest in Higgs 
physics, the Chinese proposal bolsters the 
case for building the ILC, says Yamauchi. But 
if it goes ahead, it could eventually drain inter- 
national funding from the ILC. “It may have a 
negative impact,” he says.


SUPER-LHC 
In the future, the option to use China’s 
electron–positron collider as the basis for a giant
proton collider could interfere with CERN’s 
own plans for a 100-kilometre-circumference 
circular machine that would smash protons 
together at more than 7 times the energy of the 
LHC (see ‘World of colliders’). Until the mid- 
2030s, CERN will be busy with an upgrade that 
will raise the intensity — but not the energy — 
of the LHC’s proton beam. By that time, China 
might have a suitable tunnel that could make it 
harder to get backing for this ‘super-LHC’.

At ICHEP, Fabiola Gianotti, CERN’s director-general,
floated an interim idea: souping up the
energy of the LHC beyond its current design by
installing a new generation of superconducting
magnets by around 2035. This would provide
a relatively modest boost in energy — from
14 teraelectronvolts (TeV) to 20 TeV — that
would have a strong science case if the LHC
finds new physics at 14 TeV, said Gianotti. Its
$5-billion price tag could be paid for out of
CERN’s normal budget.


WORLD OF COLLIDERS

Physicists around the world are designing a range
of particle colliders that are much bigger than the
Large Hadron Collider at CERN, Europe’s
particle-physics laboratory.

CERN-hosted Large Hadron Collider
2009-35
Energy: 14 teraelectronvolts (TeV)
Circumference: 27 km

Japan-hosted International Linear Collider
Proposed: 2030
Energy: <1 TeV
Length: 31 km

China-hosted electron-positron collider
Proposed: 2028
Energy: 0.24 or <0.35 TeV
50 or 100 km

China-hosted proton collider
Proposed: 2030s
Energy: 70-100 TeV or 100-140 TeV
50 or 100 km

CERN-hosted super proton collider
Proposed: 2035-40
Energy: 100 TeV
100 km

For decades, successive facilities have found 
particles predicted by the standard model, 
and neither the LHC nor any of its proposed 
successors is guaranteed to find new physics. 

Questions from attendees at the ICHEP 
session revealed some soul-searching, including 
a plea to reassure young high-energy physicists 
about the field’s future, and contemplation of 
whether money would be better spent on other 
approaches rather than ever-bigger accelerators. 

The United States is betting on 
neutrinos, fundamental particles that could 
reveal physics beyond the standard model, not 
colliders. The Fermi National Accelerator Laboratory
(Fermilab) in Batavia, Illinois, hopes to
become the world capital of neutrino physics
by hosting the $1-billion Long-Baseline Neutrino
Facility, which will beam neutrinos to a
range of detectors starting in 2026.

Funding will require approval from US 
Congress in 2017. But at the ICHEP session, 
Fermilab director Nigel Lockyer was confi- 
dent: “We are beyond the point of no return. It 
is happening.”
Barack Obama embraced science, but his policies get mixed reviews.


Grading America’s 
‘scientist-in-chief’ 


US President Barack Obama sought to map the brain, cool 
the climate and chart a path to Mars. As he prepares to leave 
office, Nature looks back at the scientific highs and lows of 


his presidency. 


BY HEIDI LEDFORD, RICHARD
MONASTERSKY, JEFF TOLLEFSON & 
ALEXANDRA WITZE 


BETTING BIG ON 
BIOMEDICAL SCIENCE 


When president-elect Barack Obama chose 
physicist John Holdren as his top science adviser 
in December 2008, some biomedical research- 
ers worried that the pick signalled a White 
House bias towards physical science. 

Obama quickly put those fears to rest. Within 


weeks of his inauguration, he had overturned 
restrictions on research using embryonic stem 
cells. He has gone on to launch major initiatives 
to map the brain, develop personalized medical 
treatments and cure cancer. 

But faced with a penny-pinching Congress, 
Obama’s strong support for biomedical science
has not translated into significant funding gains 
for the US National Institutes of Health (NIH). 
The agency has seen the purchasing power 
of flat research budgets eroded by inflation 
(see ‘Budget battles’). “The life sciences were 
a significant priority for the Obama administration,”
says Gregory Petsko, a biochemist at
Weill Cornell Medical College in New York
City. “But with Congress being the way that it is,
there was a limit to what Obama could do as far 
as increasing support of biomedical research.” 
It is the big initiatives that will probably 
form Obama's lasting biomedical legacy, says 
Benjamin Corb, head of public affairs at the 
American Society for Biochemistry and 
Molecular Biology in Rockville, Maryland. In 
2013, Obama announced the Brain Research 
Through Advancing Innovative Neurotech- 
nologies (BRAIN) initiative to map the human 
brain. In 2015, he unveiled the Precision Med- 
icine Initiative, which includes an ambitious 
study of health records and genomic informa- 
tion from one million people in the United 
States. And in January, he introduced the 
Cancer Moonshot, a US$1-billion proposal to 
double the pace of cancer research in five years. 
NIH director Francis Collins, who led the 
Human Genome Project in the 1990s, likens 
Obama to a player who scores three goals in 
the same game: “I said to him, basically, ‘Mr
President, you have achieved a hat-trick.’”
But such programmes may come at a cost 
to basic research funding, even as they draw 
attention to areas of science that may be over- 
looked or underfunded. “These big initiatives 
tend to cast a really large shadow,” says Corb.
“They can overshadow some of those basic 
research needs.” 
And it’s not clear whether Obama’s major 
initiatives will survive under the next president.
Democratic presidential candidate
Hillary Clinton has said that she would
continue the Cancer Moonshot initiative. She 
also supports Alzheimer’s disease research, 
which bodes well for the BRAIN initiative 
if she is elected, Corb says. Republican can- 
didate Donald Trump has no clear policy on 
biomedical research. 

But the next president won’t be making that
decision alone. Patient advocates drive major 
changes in biomedical research priorities and 
funding over time, and will probably ensure 
that Obama's big-science initiatives continue, 
says Mary Woolley, president of the science- 
advocacy organization Research!America 
in Arlington, Virginia. “Determined advo- 
cates are not going to take ‘no’ for an answer,” 
she says. “They'll be the ones that bridge 
administrations.”


SPACE RACE STALLS 


Obama tried to shake up the US space 
programme, including NASA’s long-standing 
plan to send people to Mars. But nearly eight 
years — and a series of U-turns — later, he has 
little to show for his effort. 

“Where NASA is today is really not all that 
different from where it was during the last 
presidential transition,” says Marcia Smith, a 
space-policy analyst in Arlington, Virginia, 
who runs SpacePolicyOnline.com. 

A crewed Mars mission remains two 
decades away. Its schedule is constrained by 
the funding available to develop the necessary 
hardware — a new heavy-lift rocket and crew 
capsule to sustain astronauts in deep space. 
That is almost exactly the situation NASA 
was in eight years ago, bar one detail: Obama 
ditched the Moon as a first stop for astronauts 
on their way to Mars. 


That decision, in February 2010, stunned 
NASA, Congress and space-policy experts. 
Obama cancelled the Constellation pro- 
gramme, which his predecessor George W. 
Bush created to send US astronauts back to 
the Moon in preparation for an eventual Mars 
trip. Two months later Obama announced 
a different course: astronauts would visit a 
yet-to-be-chosen asteroid before heading off 
to the red planet. The White House did not 
consult Congress on the switch, angering pow- 
erful members who represent space-industry 
employees in states such as Florida, Texas and 
California. “The hostility created by the way 
the Obama administration rolled that out still 
lingers in Congress,” says Smith. 

The decision also alienated traditional US 
space partners such as Europe and Japan, 
says Scott Pace, director of the Space Policy 
Institute at George Washington University 
in Washington DC. “Little to no weight was 
given to the international implications of the 
decision to abandon efforts to lead an interna- 
tional return of humans to the lunar surface,” 
he says. 

NASA was forced to modify its Mars plan in 
2013, when it became clear that it did not have 
the technology to support astronauts in deep 
space. The White House introduced a contro- 
versial stopgap measure: instead of a crewed 
mission to visit an asteroid, a robot would drag 
an asteroid near the Moon where astronauts 
could then visit it. 

Asteroid scientists have roundly denounced 
the plan, but it is moving forward despite slip- 
ping schedules and ballooning costs. The hard- 
ware for crewed deep-space journeys is also at 
risk of schedule and budget delays, the Gov- 
ernment Accountability Office said last month. 
The heavy-lift rocket is scheduled for its first
test flight in November 2018, while the crew
capsule’s is set for August 2021.

Obama shut down the space shuttle, and encouraged the growth of a commercial space industry.

Obama extended US participation in the 
International Space Station for four more 
years, to 2024 — a move generally acclaimed 
by scientists. And he oversaw the shutdown of 
the space-shuttle programme, a process begun 
by Bush. After the last shuttle, Atlantis, flew in 
July 2011, the United States turned to Russia to 
buy rides to orbit for its astronauts. 

NASA is relying on commercial companies 
to fly equipment and — eventually — astro- 
nauts to the space station. The first commercial 
cargo flights began in 2012, and the first astro- 
nauts are scheduled to fly aboard commercial 
spaceships no earlier than 2017. 

Many critics see NASA’s human-spaceflight
programme as adrift. Eileen Collins, a former 
space-shuttle commander, told the Republican 
National Convention in July that the agency 
needs “visionary leadership again”. 

Scientists grumble about the relative lack of 
flagship missions in development. One of the 
biggest, a proposed mission to Jupiter’s moon 
Europa, has been pushed through not by the 
White House or NASA, but by a Republican 
congressman from Texas who is enamoured 
with the idea of life on icy worlds. 


UNEVEN PROGRESS ON 
SCIENTIFIC INTEGRITY 


Many researchers who watched Obama’s 
inauguration in 2009 were thrilled by his 
pledge to “restore science to its rightful place”. 
But scientists and legal scholars say that, in 
many ways, Obama has failed to live up to that 
lofty promise. 

In general, government researchers have 
enjoyed more freedom — and endured less 
political meddling — than they did under the 
previous president, George W. Bush. Bush’s 
administration was accused of muzzling or 
ignoring scientists on subjects ranging from 
stem cells to climate change. 

In March 2009, Obama instructed agencies 
to develop policies to reduce political inter- 
ference and increase transparency about the 
research used in policy decisions. And when 
the Union of Concerned Scientists (UCS) sur- 
veyed federal researchers in 2015, most said 
that their agency adhered to its scientific- 
integrity policy. 

But critics say that Obama’s White House 
has not shied away from exerting political 
influence over science. 

In 2011, the Environmental Protection 
Agency (EPA) sent a proposal to the White 
House that would strengthen controls on 
ozone pollution, based on guidance from its 
scientific advisers. But Obama directed the 
agency to withdraw the plan, citing the cost of 
the stricter limits at a time when the economy 
was still recovering from a recession. 

And that same year, Health and Human 
Services secretary Kathleen Sebelius overruled
the Food and Drug Administration’s
finding that the emergency contraceptive 
‘Plan B One-Step’ was safe to dispense over 
the counter for all women and girls. 

In both cases, science eventually won out: 
the EPA approved stronger ozone stand- 
ards in 2015, and the FDA approved unre- 
stricted sales of Plan B in 2013 after judges 
ruled against the agency. Nevertheless, these 
examples show how political considerations 
have sometimes trumped scientific ones dur- 
ing Obama’s tenure, says Lisa Heinzerling, a 
law professor at Georgetown University in 
Washington DC. 

“There are structures in place that threaten 
scientific integrity and encourage the injection 
of politics into matters that are supposed to be 
scientific or technical,” says Heinzerling, who 
worked at the EPA for two years under Obama. 

Science advocates are concerned about 
how political influence shapes science behind 
closed doors at the White House. The presi- 
dent’s Office of Management and Budget, 
which reviews proposals for new rules and 
regulations, can make substantial changes or 
kill a policy without explaining why. “In some
cases, the White House is messing around, 
and it’s not doing it transparently,” says Wendy
Wagner, a law professor at the University of 
Texas at Austin. 

The recent UCS survey revealed room for 
improvement at several agencies. Nearly half 
the scientists at the Centers for Disease Control 
and Prevention said that their agency gave too 
much weight to political interests; that pro- 
portion rose to 73% at the Fish and Wildlife 
Service. And less than 60% of scientists at the 
four agencies surveyed said they could openly 
express concerns about the work of their 
employer without fear of retaliation. 

“We have a lot of new policies and 
procedures in place that are tremendously beneficial,”
says Gretchen Goldman, a UCS analyst
who led the study. “But what we’re finding is
that there’s more work to be done.”


CLIMATE POLICY HOTS UP 


Global warming was one of Obama’s top 
priorities — and one of the most difficult to 
address, given strong opposition from Repub- 
licans in Congress. Yet he managed to help bro- 
ker a global climate accord and push through 
regulations to curb greenhouse-gas emissions 
from cars, trucks and power plants. 

“Obama has established a terrific climate 
legacy,” says David Doniger, who directs the 
climate and clean-air programme at the Natu- 
ral Resources Defense Council, an advocacy 
group in New York. 

US President Barack Obama, who took office in January 2009, pushed to increase
funding for science agencies. But Congress often rebuffed his proposals.

The president’s earliest actions capitalized
on the global financial crisis. In February 2009,
Obama signed economic-stimulus legislation
that included nearly $37 billion for clean-energy
research and development (R&D) at
the Department of Energy. Four months later,
with failing car companies seeking a federal 
bailout, the Obama administration proposed 
higher fuel-efficiency requirements and the 
first greenhouse-gas standards for passenger 
vehicles. The regulations, which took effect in 
2012, will nearly double the average fuel effi- 
ciency of vehicles by 2025, to around 23 kilo- 
metres per litre. 
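
For readers more used to other units, the 2025 figure can be converted with a short calculation. This is a sketch only: the function names are hypothetical, and the choice of US gallons and statute miles is an assumption made here for illustration.

```python
KM_PER_MILE = 1.609344            # international statute mile
LITRES_PER_US_GALLON = 3.785411784


def km_per_litre_to_us_mpg(km_per_litre: float) -> float:
    """Convert fuel efficiency from kilometres per litre to miles per US gallon."""
    return km_per_litre * LITRES_PER_US_GALLON / KM_PER_MILE


def km_per_litre_to_litres_per_100km(km_per_litre: float) -> float:
    """Convert fuel efficiency to fuel consumption in litres per 100 kilometres."""
    if km_per_litre <= 0:
        raise ValueError("efficiency must be positive")
    return 100.0 / km_per_litre


# Figure from the text: around 23 kilometres per litre by 2025.
target_mpg = km_per_litre_to_us_mpg(23)
target_l_per_100km = km_per_litre_to_litres_per_100km(23)
print(f"{target_mpg:.1f} miles per US gallon")
print(f"{target_l_per_100km:.1f} litres per 100 km")
```

On these assumptions, 23 kilometres per litre corresponds to roughly 54 miles per US gallon, or about 4.3 litres per 100 kilometres.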
And after his campaign for a comprehensive
climate bill failed in 2010, an emboldened
Obama used existing laws to
issue regulations that curbed greenhouse-gas 
emissions, bolstered energy-efficiency stand- 
ards and expanded energy R&D programmes. 

But the president’s big push on climate came 
in advance of the United Nations climate 
summit in Paris in 2015. He committed the 
United States to reduce emissions by at least 
26% below 2005 levels by 2025, and negotiated 
directly with countries such as China to build 
support for a global climate agreement. The 
final version, adopted on 12 December, aims 
to hold average global temperatures to 1.5-2°C 
above pre-industrial levels. 

“Paris is a major achievement for the world,” 
says Robert Socolow, a climate scientist at 
Princeton University in New Jersey. “I don't 
think it would have happened without Obama.” 

Yet Obama’s domestic achievements could
be undone by legal challenges. In February,
the US Supreme Court temporarily blocked
a federal regulation to reduce emissions from
existing power plants. The fate of that rule —
the cornerstone of Obama’s plan to reduce
emissions — could depend on the election in
November. The Supreme Court is down one
member and the next president will choose a
replacement, who could decide whether the
climate rule stands.

Some environmental experts say that
Obama should have pushed harder for a
comprehensive climate bill, rather than settling
for piecemeal regulations. “All of these things
are actually small bites at the apple that won’t
achieve meaningful emissions reductions over
time,” says Catrina Rorke, director of energy
policy at the R Street Institute, a conservative
think tank in Washington DC.

Others criticize Obama for encouraging a
vast expansion of domestic oil and gas
development, even as he sought to wean the country
off coal and curb its greenhouse-gas emissions.
“The administration is still trying to have it
both ways,” says Stephen Kretzmann, executive
director of Oil Change International, an
advocacy group in Washington DC.

Obama rejected the Keystone XL pipeline,
which would have carried oil from the Canadian
tar sands to US refineries, and has said
that some fossil fuels should be kept “in the
ground”. But his administration continues to
push an ‘all-of-the-above’ energy strategy that
leads to higher production of domestic fossil
fuels, Kretzmann says.

Nonetheless, Obama has helped to change
the conversation about global warming at
home and abroad, says Doniger. “The next
president needs to do more,” he says, “but
did the Obama administration move the ball
forward? They sure did.”


Additional reporting by Sara Reardon 
BIG PHARMA’S COST-CUTTING CHALLENGER


A non-profit organization is proving that drug development doesn’t 
have to cost a billion dollars. Can its model work more broadly? 


BY AMY MAXMEN 


First, there was the pitching and rolling in an old Jeep for eight
hours. Next came the river crossing in a slender canoe. When 
Nathalie Strub Wourgaft finally reached her destination, a clinic 
in the heart of the Democratic Republic of the Congo, she was 
exhausted. But the real work, she discovered, had just begun. 

It was July 2010 and the clinic was soon to launch trials of a treat- 
ment for sleeping sickness, a deadly tropical disease. Yet it was woefully 
unprepared. Refrigerators, computers, generators and fuel would all 
have to be shipped in. Local health workers would have to be trained to 
collect data using unfamiliar instruments. And contingency plans would 
be needed in case armed conflict scattered study participants — a very 
real possibility in this war-weary region. 

This was a far cry from Wourgaft’s former life as a top executive in the 
pharmaceutical industry, where the hospitals that she commissioned for 
trials were pristine, well-resourced and easy to reach. But Wourgaft, now 
medical director for the innovative Drugs for Neglected Diseases initia- 
tive (DNDi), was confident that the clinic could handle the work. She was 
right. With data from this site and others, the DNDi will next year seek 
approval for a sleeping-sickness tablet, fexinidazole. It would be a massive
improvement on existing treatment options: an arduous regimen of intra- 
venous injections, or a 65-year-old arsenic-based drug that can be deadly. 

The DNDi is an unlikely success story in the expensive, challenging
field of drug development. In just over a decade, the group has
earned approval for six treatments, tackling sleeping sickness, malaria, 
Chagas’ disease and a form of leishmaniasis called kala-azar. And 
it has put another 26 drugs into development. It has done this with 
US$290 million — about one-quarter of what a typical pharmaceutical 
company would spend to develop just one drug. The model for its suc- 
cess is the product development partnership (PDP), a style of non-profit 
organization that became popular in the early 2000s. PDPs keep costs 
down through collaboration — with universities, governments and the 
pharmaceutical industry. And because the diseases they target typically 
affect the world’s poorest people, and so are neglected by for-profit
companies, the DNDi and groups like it face little competitive pressure. They
also have lower hurdles to prove that their drugs vastly improve lives. 
Now, policymakers are beginning to wonder whether their methods
might work more broadly. “For a long time, people thought about R&D
as so complicated that it could only be done by the biggest for-profit firms
in the world,” says Suerie Moon, a global-health researcher at the Harvard
T.H. Chan School of Public Health in Cambridge, Massachusetts, who
studied PDPs and joined the DNDi’s board of directors in 2011. “I think
we are at a point today where we can begin to take lessons from their
experience and begin to apply them to non-neglected disease,” she says.

In that vein, the DNDi has started research on alternatives to pricey 
drugs for hepatitis C, and is spearheading an effort to create antibiotics 
for drug-resistant infections, a problem that pharmaceutical companies 
have been slow to contend with. If successful, the work could challenge 
standard assumptions about drug development, and potentially rein in 
the runaway price of medications. “We can’t match our financial figures 
one to one,” says executive director Bernard Pécoul. “But we believe that
DNDi can demonstrate that a different model is possible for R&D.”


THE PIPELINE 

When medical charity Médecins Sans Frontières (MSF; also known as
Doctors without Borders) won the Nobel Peace Prize in 1999, its mem- 
bers decried the lack of lifesaving drugs for diseases of the poor, and used 
the Nobel prize money to kick-start the DNDi. Pécoul, a soft-spoken 
Frenchman who had been with MSF for 20 years, took the helm when 
the initiative launched in Geneva, Switzerland, in 2003. Pharmaceutical 
executives were sceptical. Drug development is an expensive, complex, 
decade-long endeavour. “In the early days, we saw DNDi as a bit
amateurish,” recalls François Bompart, a medical director at the Paris-based
drug company Sanofi. “We thought, they cannot be serious.” 

Pécoul and his team started with a safe project. In 2001, the World 
Health Organization had called for malaria drugs that combined ingre- 
dients to slow the spread of resistance to the single best available agent, 
artemisinin. But the poverty of most people who need malaria drugs 
meant that the private sector had little incentive to create and test such 
combination therapies. Pécoul contacted Sanofi, which owned two 
malaria treatments: one based on artemisinin, and the other on the
slower-acting amodiaquine. He proposed a deal in which the DNDi
would pay for and run clinical trials on a pill that combined the two
drugs. In return, Sanofi would not patent the pill and would sell an
adult course of treatment for no more than $1, half that for children. “To
me it sounded very aggressive and not reasonable, since the two drugs
separately were two to three times that,” says Bompart.

DNDi medical director Nathalie Strub Wourgaft examines a child in Sudan.

But Pécoul convinced Sanofi that the move would be good for the 
company’s public image. He also compromised, allowing Sanofi to stipu- 
late that it could reach the low price gradually. As it turned out, by the time 
the pills were approved in 2007, manufacturing costs had come down 
far enough for the company to 
meet the target price right out 
of the gate. Hundreds of mil- 
lions of pills have since been 
distributed in Africa. All told, 
the project cost the DNDi 
about $14 million, a tiny sum 
in the world of drug develop- 
ment. It has since replicated 
the process to develop other combination therapies (see ‘Discount drugs’). 

Although they improve on existing therapies, some of these combinations
remain inadequate. The DNDi’s sleeping-sickness therapy NECT, for
example, reduces a standard treatment from 56 intravenous infusions to
14. That is still problematic in affected countries: clean needles can be hard
to come by, and long hospital stays are often impossible. People need a pill. 

Drug development from scratch is arduous and expensive. It begins 
with experiments on hundreds of thousands of chemicals in the lab, 
looking for one that kills a pathogen without harming the host. The 
DNDi does not have a laboratory, so it does this through collabora- 
tions. It searches for promising leads in compound libraries generated by 
biotechnology and pharmaceutical companies. Many firms are willing 
to share access to these precious libraries because the diseases that the 
DNDi targets will not result in blockbuster drugs, so it is not infring- 
ing on their turf. The DNDi then contracts high-throughput screening 
centres, such as those at the Institut Pasteur Korea in Seongnam and the
University of Dundee, UK, to test them out. “We use the same technique
that pharma does,” says Rob Don, director of discovery and preclinical
research at the DNDi, “but we do it for less.”

In 2007, such efforts identified fexinidazole, a compound that had 
shown promise against single-celled parasites but was pulled from 
development before reaching clinical trials. The DNDi turned it into 
a tablet, and passed it to its clinical-development team two years later. 

The DNDi approached Sanofi again and promised to take care of 
trials if the company could file for regulatory approval. Sanofi warned 
that human trials would not be easy, because sleeping sickness is not 
common and people who 
get it tend to live in remote, 
unstable regions. But with 
the existing therapies being 
so dreadful, Wourgaft argued 
that any improvements from 
fexinidazole would be clear. 
“The delta between what we
bring and what exists is huge.
You don’t need a magnifying glass on thousands of patients to see it.” She
set up multiple small trial sites in the Democratic Republic of the Congo 
and the Central African Republic and pooled their data. 


CLINICAL CHALLENGE 
Wourgaft says that the studies were the hardest she has ever run. In 
addition to logistical challenges, civil war erupted in the Central African 
Republic shortly after the study launched, and rebel groups repeatedly 
robbed a clinic there and threatened the Congolese surgeon leading 
the trial. “I squeeze all my energy into each project,” Wourgaft says. “It’s
as if I’m using forceps to deliver a baby — and the baby is an elephant.”
The final trials on fexinidazole conclude this year, and Wourgaft 
is hopeful that the data will earn regulators’ stamp of approval. The 
project has so far cost the DNDi about $45 million — and it stands 
to help 21 million people at risk of the disease in Africa. In a few 
months, Wourgaft will launch another trial, on a completely new oral
drug — SCYX-7158 — that may cure people with sleeping sickness in a
few days. The DNDi estimates that its development up to approval will
cost around $50 million.


DISCOUNT DRUGS

The Drugs for Neglected Diseases initiative (DNDi) has produced several drugs in the past decade for a fraction of what pharmaceutical companies are said to
spend. Factoring in the cost of failed candidates (not included below), the DNDi estimates that it can develop combination therapies for between US$10 million
and $45 million, and make a completely new drug from scratch for $110 million to $170 million.

COMBINATION THERAPIES

SSG&PM | Kala-azar (visceral leishmaniasis)
NECT | Sleeping sickness

NOVEL DRUGS

Fexinidazole | Sleeping sickness
SCYX-7158 | Sleeping sickness

$5m
$4.1m


BREAKING BILLIONS 

For more than three decades, economists at the Tufts Center for the 
Study of Drug Development in Boston, Massachusetts, have collected 
proprietary data from pharmaceutical companies, and used it to calcu- 
late the average cost of developing a new drug. The most recent estimate 
is $1.4 billion. This is used to justify exorbitant drug prices — companies 
must recoup their investments. But many don’t think it has to cost that 
much. Even the chief executive of London-based pharmaceutical giant 
GlaxoSmithKline, Andrew Witty, has called billion-dollar estimates 
“one of the great myths of the industry”. He attributed the huge sums to 
spending too much time on failures. Drug candidates can be killed as a
result of safety concerns, poor efficacy or profitability worries, and he 
argued that companies could save money by dropping bad leads sooner. 
Others say that the figure is inflated by large and excessive trials done 
to prove that a new drug works just slightly better than an existing one. 

By averaging the cost of projects in its portfolio, the DNDi says that it 
can develop a new drug for between $110 million and $170 million. Like 
the Tufts estimate, these prices include a theoretical cost of failed projects. 
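
To put the two estimates side by side, the short sketch below expresses the DNDi's range as a share of the Tufts figure. It uses only the numbers quoted above; the variable and function names are illustrative assumptions, and the comparison ignores the in-kind support that the DNDi receives.

```python
# Figures quoted in the text, in millions of US dollars.
TUFTS_ESTIMATE_M = 1400.0       # Tufts average cost of developing a new drug
DNDI_RANGE_M = (110.0, 170.0)   # DNDi's estimated range for a new drug


def share_of(cost_m: float, reference_m: float) -> float:
    """Return a cost as a fraction of a reference cost."""
    return cost_m / reference_m


low_share = share_of(DNDI_RANGE_M[0], TUFTS_ESTIMATE_M)
high_share = share_of(DNDI_RANGE_M[1], TUFTS_ESTIMATE_M)

# Prints: DNDi estimate is 7.9% to 12.1% of the Tufts figure
print(f"DNDi estimate is {low_share:.1%} to {high_share:.1%} of the Tufts figure")
```

On these figures, the DNDi's estimate is roughly one-eighth to one-thirteenth of the industry average.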

The DNDi admits to enjoying perks that pharma does not have. It keeps 
overhead costs low because its organization is virtual. The research organ- 
izations that it contracts probably charge the group less than they would 
a for-profit company. The DNDi also relies on scientific consultants who
work for low pay because they relish the chance to make lifesaving drugs
without considering competitors, investors and marketing. “DNDi gets
a lot for free,” says Richard Bergström, director-general of the European
Federation of Pharmaceutical Industries and Associations in Brussels.
“My companies do a lot of pro bono work, and so do universities.”

Still, the organization reckons that such in-kind contributions account 
for just 10–20% of its expenditure. It saves much more through efficient
collaboration (avoiding duplicated effort by screening pooled libraries, 
for example) and a focus on desperately needed drugs. Clinical trials can 
be smaller, faster and cheaper when the people who run them don't have 
to struggle to show barely perceptible improvements. And the DNDi kills 
candidate compounds only if they fail on safety or efficacy — it doesn’t 
have to worry about marketability. By contrast, a few for-profit companies 
froze candidate drugs for hepatitis C after Gilead Sciences of Foster City, 
California, brought powerful drugs to the market. “A lot of R&D failures 
in pharma are commercial rather than scientific,” says Don. “We keep 
going until it gets to market or scientifically fails.” 

The DNDi has earned respect from the industry, even though its 
founding organization has been antagonistic to big pharma. “Although 
DNDi came out of MSF, they don’t let ideological viewpoints get in the
way of making progress,” says Jon Pender, vice-president of government 
affairs at GlaxoSmithKline. He and others praise Pécoul’s skills at nego- 
tiation, and the DNDi’s pragmatic approach to development challenges. 

DISCOUNT DRUGS: COSTS BY DEVELOPMENT STAGE

Drug | Late safety and efficacy trials | Access and additional studies | Total
SSG&PM | $10.5m | $2.5m | $13m
NECT | $4m | $3.6m | $7.6m
Fexinidazole | $32m | $17m* | $62m
SCYX-7158 | $20.8m | $17m* | $67m

Early safety and proof-of-concept trials: not needed for combinations of approved compounds.
*Projected estimates until 2020


Policymakers have taken notice, too. Last year, the World Health
Organization asked the DNDi to consider antibiotics for drug-resistant
infections in the developing world; in May, the initiative announced 
that it would start the GARD (Global Antibiotic Research and Develop- 
ment) partnership with $2.2 million in seed funding. GARD will start by 
repurposing and combining existing antibiotics to treat a few diseases, 
including gonorrhoea and infections in newborn babies. Marja Esveld, 
a research adviser at the Netherlands ministry of health, is watching it 
closely. “We are worried about the rising costs of pharmaceuticals,” she
says, “and so for us, GARD is also a kind of experiment to see if the DNDi 
model can work for the development of drugs in the Western world.” 

Not everyone is convinced. Economist Ramanan Laxminarayan, 
director of the Center for Disease Dynamics, Economics and Policy in 
Washington DC, says that pharmaceutical companies have an incentive 
to make antibiotics for multidrug-resistant infections because patients 
in the United States and Europe will pay to get them — and non-profit 
organizations cannot hope to compete. Once the drugs exist, he says, 
subsidies could ensure that they are affordable. 

Pécoul disagrees: he doesn’t think that subsidies, donations or tiered 
pricing can ensure accessibility. “We need appropriate products and a 
sustainable market for those products,” he says. That environment has 
not materialized for other conditions: Gilead’s hepatitis C drugs, for 
example, are listed at more than $74,000 for a course. And their potency 
against some strains of the virus is questionable, says Pécoul. When he and 
his team learned about other hepatitis drug candidates being frozen, he 
launched a project to turn them into treatments that more people could 
use and afford. They're also attempting to combine existing drugs. 

If the group succeeds with this and with antibiotics, it will have shown 
that its model can be applied to diseases that affect developed countries. 
“I hope we provide lessons that can be used by others,” says Pécoul. But
companies won't simply adopt the DNDi’s methods, because they do not 
generate profit. The investors who keep firms alive are concerned with 
the bottom line. Pécoul says that a transformation would require govern- 
ment involvement and a reorganization of the development process. It 
would need a system to prioritize what treatments are needed and which 
companies and organizations could collaborate; and it would require fore- 
thought about how the final products would reach those in need. It means 
shifting away from profit-based incentives to things such as prizes and 
government funding. Today’s profit-driven approach is not only expen- 
sive, Pécoul says, it fails huge swathes of the population. 

When Wourgaft reflects on the differences between her career in 
pharma and her work at the DNDi, she thinks not about the cost of 
research and development, but about the value of a human life. She recalls 
one trip to a Congolese sleeping-sickness trial site. She sat on a cot beside 
a woman in the middle of a psychotic episode, and spoke to her desper- 
ate husband. Later, she learned that the woman survived because of the 
DNDi’s treatment. “When you see that, you know the value of what you're 
doing,” she tells me. “We are trying to fix diseases that are lethal — this is 
really serious medicine.”


Amy Maxmen is a science journalist in Berkeley, California. 
New York City’s High Line park, a transformed former rail line.


Expand the frontiers of 
urban sustainability 


Social equity and global impacts are missing from measures of cities’ environmental 
friendliness, write David Wachsmuth, Daniel Aldana Cohen and Hillary Angelo. 


Manhattan skyscrapers, rather than
rural towns, are quickly
becoming the picture of sustain- 
able living in the twenty-first century. San 
Francisco, Copenhagen and Singapore each 
top their regions in the Green City Index 
(see go.nature.com/2bxjac9). As sites of 
innovation and economic dynamism, these 
places exemplify a blend of density and 


livability that large, prosperous cities in the 
‘global south’, such as Mumbai in India and
São Paulo in Brazil, increasingly emulate.

A few decades ago, cities were seen as sus- 
tainability problems rather than solutions. 
Then, as concerns about suburban sprawl, 
shanty towns and climate change grew, so 
too did awareness that clustering people in 
energy-efficient buildings and walkable,
shady neighbourhoods makes cities more
pleasant to live in and better for the global 
environment. 

But the prevailing model of urban 
sustainability is too narrow. Although 
the social, economic and ecological 
issues behind sustainability problems are 
regional or global in scale, urban policy 
usually addresses single ecological 
issues in individual neighbourhoods.
Focusing on dense cities and their affluent 
areas ignores social movements and their 
advocacy for quality-of-life issues such as 
housing and commuting, which have direct 
ecological consequences. Targeting specific
districts ignores the often negative regional
and global impacts of local environmental,
or ‘greening’, improvements.

Spatially, sustainability research and
policymaking should shift focus from
city centres to urban regions and global
networks of production, consumption and 
distribution. Socially, policymakers should 
incorporate equity into every stage of the 
urban-policy process, from research to for- 
mulation to implementation. 


NEIGHBOURHOOD WATCH 

From the revitalization of city parks to 
urban bicycle-sharing programmes, 
urban sustainability interventions tend to 
be conceived, implemented and evaluated 
one municipality or neighbourhood at a 
time. Yet urban environmental processes 
occur on much larger scales. Projects that 
benefit one district may have negative 
impacts next door. 

One example is environmental gentri- 
fication. As districts become greener, they 
become more desirable and expensive. The 
premiums placed on neighbourhood ameni- 
ties — such as walkability, public transport 
and the proximity of parks, farmers’ markets 
and ‘greenways’ such as hiking trails and bike 
paths — by residents who can afford to pur- 
sue them raise the cost of living. 

Social displacement can result. Policies 
that encourage these improvements tend 
not to be linked to a broader social-equity 
agenda, so low- and middle-income resi- 
dents are forced into peripheral neigh- 
bourhoods where population densities are 
lower, commutes are longer and environ- 
mental problems are more common. Many 
sustainability gains are simply a regressive 
redistribution of amenities across places. 

For example, in North American cities 
such as New York and San Francisco, poor 
districts have long suffered from the dumping
of industrial waste, low air quality and
a lack of green spaces. In recent years, often 
in response to community activism, policy- 
makers have tried to create shadier streets 
and more recreational space, to improve 
public transport and greenway access, and 
to build mixed-use eco-friendly housing in 
such neighbourhoods. New York City has 
made efforts to green East Harlem, west- 
ern Queens and Red Hook in Brooklyn. Yet 
poor people are frequently priced out and 
must move¹.


In Europe, the German city of Freiburg 
has been internationally recognized for its 
achievements in renewable energy, pub- 
lic transport, participatory planning and 
pedestrianized, energy-efficient districts. 
As the metropolitan region has become 
more desirable and expensive, more of its 
workforce has turned to the cheaper sub- 
urbs for housing. The city has grown more 
socially homogenous, while beyond its 
boundaries commuting has skyrocketed, 
as have the associated carbon emissions².
Greening has come at the expense of com- 
munity stability and racial and economic 
diversity, and has undermined regional 
environmental goals. 

These patterns hold around the world. 
Studies have shown that in several cities, the 
social costs of climate adaptation fall mainly 
on disadvantaged groups. Examples include 
Medellín, Colombia; Jakarta, Indonesia;
Dhaka, Bangladesh; and Boston, Massachu- 
setts. Climate-adaptation plans fail to engage 
poor communities and often recommend 
relocating them to unsafe areas where they 
would be more vulnerable to droughts, heat, 
flooding and disease. Meanwhile, wealthy 
residents who set the planning agenda bene- 
fit from new land-use regulations and protec- 
tive infrastructure. From Boston to Dhaka, 
resources earmarked for climate-adaptation 
are concentrated in wealthy districts and the 
risks are exacerbated elsewhere³.


The Golden Gate Bridge in San Francisco.


FARTHER AFIELD

Post-industrial cities highlight their
sustainability triumphs in terms of building density,
extensive public-transport networks and
the presence of knowledge-intensive,
high-tech firms, all of which drive down locally
produced pollution and carbon emissions.
But even high-tech workplaces depend on 
polluting activities elsewhere. Computers 
and smartphones produce growing global 
flows of electronic waste that concentrate 
their toxic by-products — such as the trace 
amounts of beryllium and mercury in 
mobile phones — in poor communities in 
the developing world. Guiyu in China used 
to be a small rice-growing village, but was 
transformed in the 1990s into the world’s 
largest processing zone for electronic waste. 
Local water rapidly became undrinkable⁴.

Even information in ‘the cloud’ has an 
environmental impact. Data centres account 
for 2% of global greenhouse-gas emissions; 
their power usage is expected to triple in 
the next decade⁵. And much financial and
high-tech activity consists of coordinating 
resource extraction and manufacturing 
activities that have moved to other parts 
of the globe. Apple designs its iPhones in 
California, but 84% of the embodied car- 
bon emissions of the phones come from 
their production in China, South Korea and 
other countries, mostly in Asia. 

The low-carbon footprints prized by 
cities such as San Francisco and Seattle 
are little more than accounting tricks. The 
main method of carbon counting attributes 
to urban areas only the emissions resulting 
from in-city activities and regional power 
plants. Few studies count the full life cycle 
of emissions for all goods and services con- 
sumed by individuals and groups in cit- 
ies, or emissions resulting from air travel. 
Those that do are telling. Consumption- 
based carbon counts for Shanghai, Seat- 
tle, San Francisco and London find more 
than double the per capita emissions of 
standard calculations. Almost 80% of San 
Franciscans’ greenhouse-gas emissions, for 
example, are produced outside the city⁶ (see
‘Remote impacts’). 
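
The difference between the two ways of counting can be made concrete with a small sketch. The example below contrasts a territorial count (in-city activities and regional power plants) with a consumption-based count that also includes emissions embodied in imports and air travel. The class, the function names and every number are hypothetical, chosen only so that about 80% of the footprint falls outside the city; they are not data for San Francisco or any real city.

```python
from dataclasses import dataclass


@dataclass
class CityEmissions:
    """Hypothetical annual emissions for one city, in tonnes of CO2-equivalent."""
    population: int
    in_city: float    # in-city activities and regional power plants
    exported: float   # part of in_city tied to goods consumed elsewhere
    imported: float   # embodied in goods, services and flights consumed by residents


def territorial_per_capita(city: CityEmissions) -> float:
    """Standard count: only emissions released in and around the city."""
    return city.in_city / city.population


def consumption_total(city: CityEmissions) -> float:
    """Consumption-based count: local emissions minus exports plus imports."""
    return city.in_city - city.exported + city.imported


def consumption_per_capita(city: CityEmissions) -> float:
    return consumption_total(city) / city.population


def share_outside(city: CityEmissions) -> float:
    """Fraction of the consumption footprint produced beyond the city."""
    return city.imported / consumption_total(city)


# Illustrative values only (assumptions, not measurements).
example = CityEmissions(
    population=800_000,
    in_city=4_000_000.0,
    exported=500_000.0,
    imported=14_000_000.0,
)

print(f"territorial: {territorial_per_capita(example):.1f} t per person")
print(f"consumption-based: {consumption_per_capita(example):.1f} t per person")
print(f"produced outside the city: {share_outside(example):.0%}")
```

With these made-up inputs, the consumption-based figure is several times the territorial one, which is the pattern that the studies cited above report for real cities.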

The apparent low-carbon benefits of 
density fall dramatically when income and 
lifestyle are controlled for. Upper-income 
urban residents in the United States and 
Europe tend to consume more imported 
goods and services, fly more often, and 
drive out of the city more often than people
living on lower incomes⁷. In the United
Kingdom, during the explosion of low- 
cost air travel from the late 1980s to the 
early 2000s, the number of working-class 
passengers flying out of London increased 
by around 60%; wealthy passengers’ trips
increased by nearly 150%. 

Although prosperous urban residents 
may commute by bicycle or public trans- 
port — the forms of low-carbon living 
most commonly cultivated by sustainabil- 
ity projects such as Freiburg’s eco-neigh- 
bourhoods — their carbon footprints are 
enlarged greatly by their consumption prac- 
tices and leisure travel. Economic activity and 
urban density in post-industrial cities are
inextricably linked with global networks of 
production, consumption and distribution. 


KEY PLAYERS 

It has become conventional wisdom that city 
leaders are more nimble and less ideological 
than their national counterparts. These two 
qualities, the story goes, allow leaders such as 
New York’s former mayor Michael Bloomberg
and former Bogotá mayor Enrique
Peñalosa, along with networks such as the
C40 Large Cities Leadership Group, to take 
the lead in confronting global sustainability 
challenges — even as international treaty 
efforts and national policymaking stall. 

This ‘urban turn’ in policy and discourse
captures important truths. But it obscures 
the fact that municipalities are more nimble 
because they wield less power. Municipal 
governments lack access to industrial policy, 
welfare systems and tax regimes. They have 
limited control over consumption patterns 
and large-scale infrastructure. And cities are 
bound by competitive pressures that pit them 
against each other in the pursuit of capital 
investment and talented workers. Municipali- 
ties thus tend to pursue sustainability policies 
that are also economic-development policies, 
and these disproportionately focus on affluent 
central business districts or residential areas 
designed to attract skilled professionals. 

This challenges, for instance, the good 
intent of the United Nations’ Sustainable 
Development Goals for cities. Reaching 
these goals requires strong national policy 
commitments to new regional infrastructure 
programmes, cash transfers to poor people, 
and local governance reform across urban 
regions. 

State, provincial and national governments 
can apply sustainability policies across local 
jurisdictional lines. In the aftermath of Hur- 
ricane Sandy, which hit the US east coast in 
2012, some of the dozens of small municipali- 
ties on the New Jersey Shore independently 
attempted to build new ‘hard’ seawalls, 
despite concerns that these would displace 
storm surges to their neighbours. Only higher 
levels of government can prevent such
‘beggar-thy-neighbour’ local politics.

And grass-roots groups bring about 
change from the bottom up. Community- 
based organizations, city-wide non-profit 
organizations and ad hoc social move- 
ments shape cities’ built environment and 
lifestyle. But these groups are often over- 
looked in discussions about sustainability 
policy because most of them do not frame 
their work in environmental terms. They are 
more likely to speak of a broader ‘right to
the city’. Advocates for affordable housing
and mass transit are proposing exactly the 
types of intervention that shrink individuals’
carbon footprints and improve community
resilience⁸. But they are rarely seen as
prospective allies by green policymakers.


REMOTE IMPACTS


In San Francisco, most of the carbon emissions associated with the consumption of goods by
residents, firms and governments in 2008 arose beyond the city’s limits — elsewhere in the United
States or overseas. Yet municipal sustainability initiatives target only the metropolitan area.

Sustainability efforts that are indifferent 
to concerns about affordability and that lack 
support from community members are less 
just and less likely to succeed. In New York 
City, an effort to implement a congestion 
charge in central Manhattan failed in the face 
of public opposition. New Yorkers in outer 
boroughs viewed the plan as elitist and indif- 
ferent to the concerns of poorer commuters. 
Still, some fledgling coalitions around equity 
and sustainability are emerging. Last year in 
São Paulo, a historic drought and state
mismanagement of scarce water resources led
housing movements and environmentalists 
— long at odds over how to deal with precari- 
ous waterside settlements — to come together 
around a common agenda of housing and 
water justice⁹.


NEXT STEPS 

First, urban environmental researchers need 
to supplement neighbourhood-specific and 
city-centric¹⁰ measurements, such as walkability
or commuting by public transport,
with ones that better capture the broader 
dimensions of ecological sustainability and 
social equity. For instance, studies of changes 
to local transit systems should analyse the 
knock-on effects in regional housing and 
labour markets. 

Second, multicity low-carbon policy net- 
works such as the C40 and climate-focused 
organizations such as the World Resources 
Institute in Washington DC should insist on,
and support, all large cities carrying out
standardized, consumption-based carbon- 
footprint analyses. As well as providing more 
accurate accounts of specific cities’ carbon 
footprints, this would underscore the extent 
to which emissions levels are correlated with 
class and income. 

Third, policymakers should treat social 
equity and ecological effectiveness as mutu- 
ally reinforcing dynamics in urban sustain- 
ability. They should bring the widest range of 
social movements to the table and see those
groups’ demands (such as revitalizing rent
regulation and public housing) as central.
This would entail more frequent meetings of 
larger groups of stakeholders and different 
metrics of policy success. But it would also 
yield more creative, sophisticated and encom- 
passing policies that would have broader pub- 
lic support. 

Only by expanding the spatial and social 
dimensions of urban policymaking can it be 
made truly sustainable and equitable.
Cliff Robertson as the title character in Charly, the 1968 film adaptation of Flowers for Algernon.


IN RETROSPECT 


Flowers for Algernon 


Ananyo Bhattacharya looks back at a science-fiction 
touchstone on the ethics of experimental biology. 


By the time science-fiction writer
Daniel Keyes died in 2014 at the age of
86, he had lived through vast upheavals
in biomedical science, from the discovery
of the DNA double helix to the sequencing 
of the human genome. But ethical over- 
sight did not always keep pace. Keyes’ novel 
Flowers for Algernon, 50 years old this year, 
highlights how often the need for oversight 
is ignored or flouted. 

Flowers for Algernon. Daniel Keyes. Harcourt, Brace & World: 1966.

A case in point is a 1946-53 study
conducted by Harvard University and
the Massachusetts Institute of Technology,
and sponsored in part by food conglomerate
Quaker Oats. Dozens of boys
with learning difficulties at the Walter E.
Fernald State School in Waltham,
Massachusetts, were fed cereals
containing radioactive tracers to track how
they absorbed iron and
calcium. The boys were told only that they
were joining a science club, and consent 
forms sent to their parents made no men- 
tion of radiation exposure. A US Depart- 
ment of Energy committee concluded in 
1994 that it was “extremely unlikely” that 
the boys had been harmed by the radiation, 
but the disregard for their human rights is 
breathtaking. Other experiments, including 
some sanctioned by the US government, 
were much more egregious. Hundreds of
African-American men involved in the 
Tuskegee Syphilis Study in Alabama from 
1932 to 1972 were never told that they had 
the disease; nor were they treated, despite 
the availability of penicillin from the 1940s. 

The Tuskegee ‘experiment’ would never 
happen today, but the Massachusetts study’s 
more subtle transgressions — in failing 
to fully regard the participants as ends in 
themselves, rather than a means to achieve 
the researchers’ ends — remain relevant. It 
is this suppression of feeling for people and 
laboratory animals in the pursuit of scientific 
knowledge that Keyes captures in Flowers for 
Algernon. 

Keyes’ novel, based on a short story that he
published in 1959, follows 32-year-old Char- 
lie Gordon, who agrees to have an experi- 
mental brain operation that may help him 
to overcome his severe learning difficulties 
and increase his intelligence (he has an IQ 
of 68). The only subject to have previously 
undergone the procedure successfully is a lab 
mouse named Algernon. After the operation, 
Charlie’s IQ rises rapidly; he soaks up new 
languages and knowledge of the arts and sci- 
ences. His journal entries, which make up the 
novel, chart his growing awareness of his own 
sexuality and emotions, particularly his feel- 
ings for his former teacher at the Beekman 
College Center for Retarded Adults. 

More revealing of Keyes’ intent is the 
evolving relationship between Charlie and 
Algernon. At first resentful of Algernon’s 
superior intellect (the mouse easily beats 
him at navigating a maze), Charlie develops a 
strong bond with his fellow experimental sub- 
ject. At the height of his genius, Charlie begins 
to investigate the experiment to advance the 
work. Soon realizing that it has flaws, he kid- 
naps Algernon to protect him. The regression 
that ends the book is so crushing that five 
publishers rejected the manuscript before it 
found a home. Flowers for Algernon became 
a best-seller (more than 5 million copies have 
been sold so far) and was adapted for the hit 
1968 film Charly, starring Cliff Robertson. It 
still features in bioethics discussions. 

Keyes had a degree in psychology and 
would later become a professor of creative 
writing at Ohio University in Athens. In 
between, he edited pulp magazine Marvel 
Science Stories and worked at Atlas Com- 
ics, the precursor to Marvel Comics. He also 
briefly taught English in New York City’s 
public-school system. The empathy that 
suffuses the novel stems from his experience 
of teaching children with learning difficulties.
When one student returned to classes
after a long absence, Keyes noted that he
had forgotten how to read. “He had lost it
all,” Keyes said. “It was a heartbreaker.” His
sympathy for Algernon seems to stem in part
from dissecting a female mouse at university:
Keyes was shaken when his incisions revealed
“a cluster of tiny fetuses” in its uterus.

Despite his compassion for experimental 
subjects, human and animal, Keyes does not 
portray researchers as the evil geniuses of 
cultural cliché. Writing before modern ideas 
of informed consent were fully established in 
the late twentieth century, Keyes portrays the 
careerist psychologist Harold Nemur, who 
leads the trial, taking pains to get permission 
from Charlie’s relatives to carry out the pro- 
cedure. Neurosurgeon Jayson Strauss, who 
performs the operation, is concerned about 
Charlie’s well-being throughout. What exer- 
cises Keyes is his scientists’ failure to imagine 
Charlie as a whole human being before his 
intelligence-enhancing operation. Whereas
Charlie’s appreciation of Algernon’s
personhood only grows, Nemur is unable
to view Charlie as anything other than a
sort of benign Frankenstein’s monster.

That hubris is
sometimes evident today, when research- 
ers fail to reflect fully on the consequences 
of their work (S. Aftergood Nature 536, 
271-272; 2016). A crop of findings suggests 
that the well-being of laboratory rodents 
has not been sufficiently prioritized. For 
example, mice are housed at around 20°C, 
cooler than their preferred temperature of 
30°C (see Nature http://doi.org/bnh7; 2013). 
Many lab animals are also overweight. As 
well as being bad for their welfare, there is 
evidence that such conditions may skew 
experimental results (Nature 464, 19; 2010). 

This year, plans to make a synthetic 
human genome were criticized when dis- 
cussions between more than 100 scientists 
took place behind closed doors and did not 
focus sufficiently on the proposal’s ethi- 
cal implications (Nature 534, 163; 2016). 
Another controversy centred on the widely 
used HeLa cell line, derived in 1951 from 
the cervical tumour that killed an African 
American woman, Henrietta Lacks. But she 
had never consented to such use. In 2013, 
the cell-line genome was published — with- 
out permission from Lacks’s living relatives. 

As the world enters the era of genome 
editing, it is tempting for scientists to 
monopolize the ethical debate once more. 
To avoid that temptation, researchers could 
do worse than turn to Keyes’ astonishing 
Flowers for Algernon, a work that, tellingly, 
has never been out of print.


Ananyo Bhattacharya is science 
correspondent at The Economist in London. 
e-mail: ananyobhattacharya@economist.com 
Books in brief


The Cyber Effect 

Mary Aiken JOHN MURRAY (2016)

In this incisive tour of sociotechnology and its discontents, forensic 
cyberpsychologist Mary Aiken has much to say about children and 
the digital world. Parents addicted to mobile phones, for instance, 
fail to give babies the ‘face time’ they need to develop non-verbal 
communication skills; and the UK Association of Teachers and 
Lecturers has linked toddlers’ tablet use with delays in speaking. With 
“compulsion loops” built into online games, and cybercommunities 
focused on extreme behaviours luring people in through online 
disinhibition, it’s time for industrial accountability, she argues. 


Dr James Barry: A Woman Ahead of Her Time 

Michael du Preez and Jeremy Dronfield ONEWORLD (2016) 

Over an illustrious career, Victorian surgeon James Barry became 
Britain’s inspector-general of military hospitals, performed one of 
the first successful Caesarean sections in Africa and achieved the 
Crimean War’s highest recovery rate. But under the overcoat, Barry 
was Margaret Ann Bulkley, who with the complicity of her mother 
and radical friends defied the rules and studied medicine at the 
University of Edinburgh. Urologist Michael du Preez and writer 
Jeremy Dronfield have drawn on fresh archive material for this 
nuanced biography of a medic with a mind-blowing secret. 

Spare the Birds! George Bird Grinnell and the First Audubon Society
Carolyn Merchant YALE UNIVERSITY PRESS (2016) 

From a fashion for feathers to habitat loss, US bird-life in the late 
nineteenth century faced pressing threats, prompting naturalist 
George Bird Grinnell — who had ties to the family of ornithologist John 
James Audubon — to launch a society and magazine in the great 
man’s name. Carolyn Merchant's lavishly illustrated environmental 
history analyses Grinnell’s contribution, from biographical writings
on Audubon to delightful field descriptions of birds he portrayed — 
noting, for instance, how cedar waxwings (Bombycilla cedrorum) aid in 
reforestation by excreting undigested cherry stones. 


Tastes Like Chicken: A History of America’s Favorite Bird 
Emelyn Rude PEGASUS (2016) 
Andrew Lawler’s 2014 Why Did the Chicken Cross the World (Atria;
see Nature 515, 490-491; 2014) gave us a natural and cultural
history of Gallus gallus domesticus, from its south Asian origins to
global ubiquity. In a breezy narrative brimming with retro recipes,
culinary historian Emelyn Rude focuses on the history of US chicken
consumption, currently 8.6 billion birds a year. From New York 
immigrants’ foul “ornithological parks” of the 1880s and 1890s to the 
rise in global demand — which can push production at the expense of 
animal welfare — Rude reveals chicken as a troublesome taste. 


Virus: An Illustrated Guide to 101 Incredible Microbes 

Marilyn Roossinck IVY (2016)

Polio, Ebola, influenza — it’s the viral villains that hit headlines, yet a 
number of viruses are benign. Environmental microbiologist Marilyn 
Roossinck sets the record straight with this stunning explication of 
101 viruses that infect everything from humans to archaea. Along 
with basics on life cycles, transmission and more, Roossinck offers 
succinct descriptions, schematic drawings and a gallery of electron- 
microscopy images that have more than a passing resemblance to 
the paintings of Jackson Pollock and Wassily Kandinsky.

Barbara Kiser
Correspondence


Don’t let climate 
crush coral efforts 


June's International Coral Reef 
Symposium brought together 
more than 2,500 influential 
people who work on coral reefs, 
yet discussion centred on solving 
the global-scale issue of climate 
change, following a worldwide 
coral-bleaching event (Nature 
http://doi.org/bdmn; 2015). In 
our view, the symposium missed 
an important opportunity
to develop real conservation 
outcomes for coral reefs at a 
local scale (see J. E. Cinner et al. 
Nature 535, 416-419; 2016). 

Discussions on climate 
change seem unproductive for 
environmental managers and 
scientists on the ground. Few 
individuals have a platform for 
engaging with global political 
leaders to drive the conservation 
agenda and influence policies 
that affect climate trajectories. 
Instead, we should be working 
together to develop strategies for 
local action that are robust to the 
uncertainty surrounding future 
climate scenarios. 

We shall have to differentiate 
between those uncertainties that 
we can resolve at a local scale, 
such as the benefits of reducing 
overfishing or inputs of sediment 
and nutrients, and those that 
we cannot. To conserve coral 
reefs, we need objectives that 
can be turned into cost-efficient 
actions to deliver measurable, 
uncertainty-proof, local benefits. 
Jennifer McGowan, Hugh 
P. Possingham University of 
Queensland, St Lucia, Australia. 
Ken Anthony Australian Institute 
of Marine Science, Townsville, 
Queensland, Australia. 
j.mcgowan@uq.edu.au


Social changes affect 
water quality too 


Researchers need a better 
understanding of the effects of 
social shifts on river basins and 
water catchments, in addition 
to the impacts of climate change 
(A. Michalak Nature 535,
349-350; 2016). To help safeguard
water-catchment services against 
these social changes, communities 
should become more involved in 
water-management issues. 
Urbanization and 
industrial and agricultural 
developments all generate 
changes in legislation, policy 
and demographics. These can 
adversely affect water-catchment 
services, which provide social, 
cultural and environmental 
benefits such as flood defence, 
recreational space, geodiversity 
and increased biodiversity. 
Catchment-management 
initiatives, such as improving 
drinking-water quality and 
offsetting flood risk, are 
estimated to cost more than 
£100 billion (US$130 billion) 
over the next 15 years in England 
alone. By engaging with these 
initiatives, local communities 
can contribute to their 
implementation and ensure that 
they are cost-effective. 
Alec Rolston Dundalk Institute 
of Technology, Ireland. 
alec.rolston@dkit.ie 


Stop marginalizing 
rare syndromes 


Rare medical disorders are 
extremely challenging for 
patients and their families, and 
for researchers trying to study 
them. As a scientist and father of 
a 41-year-old son with CHARGE 
syndrome, which affects the 
heart, ears, eyes and other organs, 
I believe that we need stronger 
commitment, a more consistent 
approach and different types of 
knowledge and skills to move 
these investigations forward. 

Promising initiatives include 
the International Rare Diseases 
Research Consortium and 
European Union projects on rare 
disorders under Horizon 2020. 
We still urgently need to improve 
and speed up genetic diagnosis, 
and to understand and mitigate 
the effects of these disorders on 
people’s health. 

This calls for an 
interdisciplinary strategy, but it 
faces significant methodological
challenges. In a small study of
81 participants, for example,
we found that those with
Down’s, Williams or Prader-Willi
syndromes all had health
problems related to diet and 
inactivity; also, there were 
important differences between 
disease groups and between 
women and men (M. Nordstrom 
et al. Food Nutr. Res. 59, 25487; 
2015). Although such differences 
need to be underpinned by
a wider evidence base from 
international collaborations, they 
indicate that a more sophisticated 
and personalized approach to 
care is paramount. 

People with rare syndromes 
lack the political power of larger 
patient groups, so advances 
depend on holistic scientific 
insights and on funders 
overcoming their reluctance to 
support marginalized research. 
Svein Olav Kolset University of 
Oslo, Norway. 
s.o.kolset@medisin.uio.no


Venezuela’s brain 
drain is accelerating 


Your interview with the president 
of the Latin American Academy 
of Sciences, Claudio Bifano, 
barely reflects the scale of the 
scientific crisis in Venezuela 
(Nature 535, 336-337; 2016). 
Its academic brain drain, for 
instance, is worse than indicated. 

The figures you quote for 
scientists leaving Venezuela are 
from a preprint we released at 
the end of last year. Since then, 
the number has swollen rapidly 
from 1,504 to 1,820 — up from 
1,783 in July, when the full 
paper was published (J. Requena 
and C. Caputo Interciencia 
41, 444-453; 2016). The latest 
tally represents almost 15% 
of Venezuela's scientists, who 
account for some 33% of its 
research publications. This rapid 
loss is coupled with the stalled 
recruitment of new talent. 

This is in lamentable contrast to 
the end of the last century, when 
the research community grew by
200 or so Venezuelan scientists a
year. Hugo Chavez took over as 
president in 1999 and, in my view, 
16 years of disastrous science 
policies followed. 

Jaime Requena Academy of 
Physical, Mathematical and 
Natural Sciences, Venezuela. 
requenaj@gmail.com 


Strengthen China’s 
flood control 


Heavy rainfall in China's Yangtze 
River basin as a result of the 
longest and strongest El Niño
event for 65 years has led to severe 
flooding and economic losses of 
almost US$10 billion. Massive 
investments in flood defences 
after the 1998 deluge, which killed 
more than 4,000 people, proved 
inadequate. New tactics could 
help boost China’s flood control. 
Alongside better levees, 
enlarged reservoirs and improved 
early-warning systems, disaster- 
risk analysis can guide strategies 
for managing disasters (see S. L. 
Cutter et al. Nature 522, 277-279; 
2015). Disaster risk depends on 
the degree of hazard, exposure 
and vulnerability (Y. Zhou et al. 
Risk Anal. 34, 614-639; 2014). 
Accurate hazard assessment 
calls for a better understanding of 
extreme weather events and their 
rising frequency and intensity. 
Exposure calculations should 
factor in different population 
densities across the region. And 
vulnerability estimates should 
note the efficacy of early-warning 
systems and the resilience of local 
infrastructure (J. Birkmann et al. 
Nat. Hazards 67, 193-211; 2013). 
Speedy access to and effective 
dispersal of reliable disaster-risk 
information and of disaster- 
relief professionals will help to 
prevent and mitigate catastrophic 
outcomes, including secondary 
disasters such as landslides. 
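
The dependence of disaster risk on hazard, exposure and vulnerability can be made concrete with a small sketch. It assumes, purely for illustration, that the three components are scored between 0 and 1 and combined by multiplication; the letter says only that risk depends on all three, so the multiplicative form, the district names and every score below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class District:
    name: str
    hazard: float         # likelihood and severity of flooding, scored 0 to 1
    exposure: float       # share of people and assets in harm's way, 0 to 1
    vulnerability: float  # weakness of warnings and infrastructure, 0 to 1


def risk_index(district):
    """Combine the three components; the product form is an assumption."""
    return district.hazard * district.exposure * district.vulnerability


def rank_by_risk(districts):
    """Return districts ordered from highest to lowest risk index."""
    return sorted(districts, key=risk_index, reverse=True)


# Hypothetical districts with made-up scores.
districts = [
    District("upland", hazard=0.5, exposure=0.6, vulnerability=0.2),
    District("riverside", hazard=0.8, exposure=0.9, vulnerability=0.5),
    District("floodplain farm", hazard=0.9, exposure=0.3, vulnerability=0.4),
]

for d in rank_by_risk(districts):
    print(f"{d.name}: risk index {risk_index(d):.3f}")
```

With a product, a district with a severe hazard but little exposure can rank below one with a moderate hazard and dense population, which is why exposure calculations need to reflect local population densities.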
Yang Zhou, Yansui Liu IGSNRR, 
Chinese Academy of Sciences 
(CAS); and Beijing Normal 
University, China. 
Wenxiang Wu IGSNRR, CAS, 
Beijing, China. 
OBITUARY


Alfred G. Knudson 


(1922-2016) 


Cancer geneticist whose insights launched the search for tumour-suppressor genes. 


After years of observing
children with the rare eye
cancer retinoblastoma,
Alfred Knudson proposed an expla- 
nation for how two different forms 
of it arise. His ‘two-hit’ hypothesis 
led to the realization that the loss of 
gene function, not just the activa- 
tion of a cancer-causing gene, could 
cause cancer. 

Knudson, who was born in Los 
Angeles, California, in 1922, died 
on 10 July, aged 93. After complet- 
ing a bachelor of science degree at 
the California Institute of Tech- 
nology (Caltech) in Pasadena in 
1944, he earned a medical degree 
from Columbia University in New 
York City in 1947, and a PhD in 
biochemistry and genetics in 1956, 
also at Caltech. Knudson then 
spent years treating children in 
medical centres in California and 
New York. 

During the 1950s and 1960s, 
cancer epidemiologists were pre- 
occupied with trying to understand 
the environmental causes of the dis- 
ease. In a 1953 paper, cancer biologist C. O. 
Nordling noted that in developed nations, 
the incidence of cancer seemed to increase 
with age (C. O. Nordling Br. J. Cancer 7, 
68-72; 1953). Nordling’s proposal that the 
occurrence of cancer needed the accumu- 
lation of at least six sequential mutations 
was ultimately proved wrong. But his idea 
that cancer is caused by a certain number 
of ‘hits’ to the genome paved the way for 
Knudson’s key insight. 

Knudson had the foresight to focus on 
inherited tumours in childhood, which 
were relatively easy to study. The tumours 
could be counted and the early occurrence 
of the disease meant that there were fewer 
confounding factors to grapple with, such 
as the random genetic mutations that 
occur throughout life. 

During his years in the clinic, Knudson 
had noticed that children with the heredi- 
tary form of retinoblastoma often devel- 
oped multiple tumours in both eyes. By 
contrast, people with the ‘sporadic’ form
developed a single tumour in only one eye. 
Also, in cases of hereditary retinoblastoma, 
the tumours typically occurred before 
the child was five; in sporadic cases, they 
occurred later in development. 


On the basis of these observations, 
Knudson proposed that in hereditary ret- 
inoblastoma, one copy of the gene involved 
is mutated in the germ line (in reproduc- 
tive cells such as eggs and sperm) and the 
other copy is mutated in somatic (non- 
reproductive) cells during the first few 
years of life, so the cancer forms earlier.
And because the germline mutation affects 
all somatic cells, these children are more 
prone to developing multiple tumours 
in both eyes. He argued that people who 
develop the other form are born with two 
normal alleles, both of which must become 
mutated in two ‘sporadic’ events in somatic
cells, so the cancer develops later in life. 
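
The arithmetic behind this reasoning can be sketched in a few lines. The sketch is a simplified illustration, not Knudson's published model: it assumes that each remaining gene copy is hit independently at a constant rate, that cells behave independently, and that the number of susceptible cells is fixed. The cell count and mutation rate are hypothetical values chosen only to make the contrast visible.

```python
import math


def p_cell_transformed(rate_per_year, years, hits_needed):
    """Chance that one cell has acquired every hit it still needs by a given age.

    Assumes each remaining gene copy is hit independently at a constant rate.
    """
    p_single_hit = -math.expm1(-rate_per_year * years)  # equals 1 - exp(-rate * t)
    return p_single_hit ** hits_needed


def expected_tumours(n_cells, rate_per_year, years, hits_needed):
    """Mean number of transformed cells, treating cells as independent."""
    return n_cells * p_cell_transformed(rate_per_year, years, hits_needed)


def p_any_tumour(n_cells, rate_per_year, years, hits_needed):
    """Poisson approximation to the chance of at least one tumour."""
    mean = expected_tumours(n_cells, rate_per_year, years, hits_needed)
    return -math.expm1(-mean)


# Hypothetical values, for illustration only.
N_CELLS = 1_000_000  # susceptible retinal cells
RATE = 6e-7          # hits per gene copy per year
AGE = 5              # years

# Hereditary form: the first hit is inherited, so each cell needs one more.
hereditary = expected_tumours(N_CELLS, RATE, AGE, hits_needed=1)
# Sporadic form: both copies must be hit in the same somatic cell.
sporadic_risk = p_any_tumour(N_CELLS, RATE, AGE, hits_needed=2)

print(f"hereditary: about {hereditary:.1f} tumours expected by age {AGE}")
print(f"sporadic: chance of any tumour by age {AGE} is about {sporadic_risk:.1e}")
```

Under these assumptions, a child who inherits the first hit is expected to develop several tumours early in life, whereas a single tumour in a child born with two normal copies is a rare event.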

Knudson published his hypothesis in 
1971 (A. G. Knudson Proc. Natl Acad. Sci. 
USA 68, 820-823; 1971). He subsequently 
applied the same logic to other inherited 
tumours, such as Wilms’ tumours (a type 
of kidney cancer) and those of the adrenal 
glands. 

In 1983, cancer geneticist Webster 
Cavenee, then at the University of Utah in 
Salt Lake City, proposed that the genetic 
‘hits’ in Knudson’s mathematical models 
must be recessive, because the development
of cancer happens only when both
gene copies are mutated or lost.
Using a technique called restriction 
fragment length polymorphism, 
Cavenee compared the DNA of 
tumours to that in normal tissues 
taken from people with retinoblas- 
toma. He showed that the loss of 
heterozygosity (caused by a loss of 
the second, previously unaffected 
allele) led to cancer (W. K. Cavenee
et al. Nature 305, 779-784; 1983). 

Knudson’s two-hit cancer hypoth- 
esis had a huge impact. Until this 
point, cancer was thought to be 
caused by the activation of onco- 
genes. Now the search was on for 
tumour-suppressor genes — whose 
loss of activity or function causes 
cancer. Knudson won the 1998 
Albert Lasker Clinical Medical 
Research Award for his work on the 
genetic basis for cancer. 

Knudson made other significant 
contributions through his leader- 
ship of one of the oldest cancer 
centres in the United States: the 
Fox Chase Cancer Center in Phila- 
delphia, Pennsylvania. Joining in 
1976, he spent 40 years there. He served as 
president (1980-82), scientific director 
(1982-83) and director of the centre’s 
Institute for Cancer Research (1976-82). 

The achievement of which he was most 
proud was giving Irwin Rose, a biochemist 
who joined the centre in 1963, US$50,000 
so that Rose could extend the stay of two 
visiting scientists from Israel. Rose and 
these scientists, Avrum Hershko and 
Aaron Ciechanover, won the Nobel Prize 
in Chemistry in 2004 for discovering 
ubiquitin-mediated protein degradation. 
Cells use this process to break down and 
recycle protein; it has aided the develop- 
ment of several cancer drugs. 

Alfred was a very supportive and 
approachable mentor, whose lack of 
patience for science that merely repeated the 
work of others kept everyone in his sphere 
striving for the new. He will be missed.
Figure 1 | The octobot. Wehner et al. have made an octopus-shaped robot that is constructed completely from soft materials. The body houses a liquid-fuel supply
and a fluidic system that controls a cyclic pattern of leg movements. Actuators that cause the legs to lift are visible as purple rectangles in the legs. Scale bar, 10 mm. 


Generation soft 


Meet the octobot, the first robot to be made entirely from soft materials. Powered by a chemical reaction and controlled by a 
fluidic logic circuit, it heralds a generation of soft robots that might surpass conventional machines. SEE LETTER P.451 


BARBARA MAZZOLAI & VIRGILIO MATTOLI 


Robots are typically used in manufacturing
contexts that involve well-structured
environments. These situations allow
them to move following predefined pro- 
cedures, limiting interactions with human 
operators for safety reasons. But if these 
machines were moved into ‘real’ environments 
outside factories, they would have to cope 
with uncertain situations, react and adapt to 
changing conditions, and interact safely with 
living organisms, including humans. These are tough
problems to solve using conventional technology
made from hard materials. Robots made
from soft, deformable materials would be
better able to grasp and manipulate unknown
objects, and to move on unstructured and
rough terrains, and might be less hazardous
to people. On page 451, Wehner et al.
present the first robot that completely lacks
rigid structures and control systems.

Soft body parts are important in many natural 
organisms. Animals such as squid, starfish and 
worms are composed almost entirely of soft 
materials and liquids, which increases their 
adaptability and robustness. There is therefore 
a growing belief that soft materials might help 
robotics technology to go beyond its current 
capabilities, by allowing robots to elongate, 
squeeze, climb and grow. For example, soft 
robotic arms inspired by octopuses can elongate,
and soft robots that mimic caterpillars
can roll and jump.

A notable attempt to develop a fully soft 
robot was reported in 2011 by workers from
the same research group as that of Wehner and 
colleagues. In that case, the robot itself was 
composed exclusively of soft materials, but 
a conventional pump-and-valve system was 
used to implement (actuate) different types of 
locomotion pneumatically, and was connected 
to the robot through cables. Wehner and
colleagues now push the technological bound- 
aries further, because not only are their robot’s 
body and actuation units soft, but so also are 
the control system and power source, which 
are integrated into the robot. This makes it the 
first completely soft robot capable of operating 
without being tethered by cables. 

The octopus-shaped robot — dubbed the 
‘octobot’ by the authors (Fig. 1) — has eight 
arms moved by a pneumatic mechanism 
that relies on the expansion of embedded, 
inflatable compartments working as actua- 
tors. These actuators are integrated into a 
fluidic-pneumatic network powered by a 
liquid fuel (an aqueous solution of hydrogen 
peroxide). The fuel passes through reaction 
chambers that contain a platinum-based cata- 
lyst, which causes the hydrogen peroxide to 
decompose. This decomposition produces 
pressurized oxygen that inflates the actuators, 


L. K. SANDERS, R. TRUBY, M. WEHNER, R. WOOD & J. LEWIS/HARVARD UNIV. 


thus generating the arm movements. 

Wehner et al. control the sequence of the 
octobot’s arm movements using a completely 
soft fluidic circuit based on a system of valves 
that act as elements of logic gates. The circuit 
creates an oscillation that converts the inflow 
of pressurized fuel from the fuel storage cham- 
ber into outflows that alternate between dif- 
ferent reaction chambers, until the system 
runs out of fuel. The octobot therefore repeats 
cycles of movements in which it first lifts four 
of its arms while lowering the other four, and 
then performs the reverse manoeuvre (see 
go.nature.com/2b3cn3s). The whole of the 
robot’s body, including the fluidic circuit, is 
made of silicone-based materials that have 
different mechanical properties, tailored to 
the functional requirements of the various 
subsystems. 
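
The alternating behaviour described above can be mimicked with a toy state machine. This is a software analogy only, not a model of the fluidic circuit: the fuel is counted in arbitrary units, each pulse is assumed to use a fixed amount, and the assignment of arms to the two reaction chambers is hypothetical.

```python
ARM_GROUPS = (
    (0, 2, 4, 6),  # hypothetical grouping: arms fed by reaction chamber 0
    (1, 3, 5, 7),  # hypothetical grouping: arms fed by reaction chamber 1
)


def run_oscillator(fuel_units, fuel_per_pulse=1):
    """Yield (chamber, lifted_arms) for each half-cycle until the fuel runs out."""
    chamber = 0
    while fuel_units >= fuel_per_pulse:
        fuel_units -= fuel_per_pulse
        yield chamber, ARM_GROUPS[chamber]
        chamber = 1 - chamber  # route the next pulse to the other chamber


for step, (chamber, lifted) in enumerate(run_oscillator(fuel_units=4)):
    lowered = ARM_GROUPS[1 - chamber]
    print(f"half-cycle {step}: chamber {chamber} fires, arms {lifted} lift, arms {lowered} lower")
```

The only state is which chamber receives the next pulse, which reflects why such a circuit can produce a fixed repeating cycle but not more varied behaviour.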

The realization of autonomous soft robots 
will require the integration of different mater- 
ials and functionalities, such as actuation, 
powering and logic; the octobot represents 
the minimal system that demonstrates the 
potential of this approach. To achieve the 
required integration, Wehner and colleagues 
used a combination of advanced fabrication 
techniques (including micro-moulding, soft
lithography and multi-material embedded
3D printing) to produce rubbery structures
embedded with fluidic channels, spanning 
several orders of magnitude in length scales. 
Despite its apparent complexity, the customiz- 
ability of this fabrication process allowed the 
authors to validate design modifications using 
a quick trial-and-error approach, so that the 
final device was rapidly optimized. 

Wehner and colleagues’ use of soft materials 
and continuum deformations — continuous 
bending of the arms to generate movement, 
rather than motion created by rigid structures 
connected by rotational joints — paves the 
way for further scientific and technological 
developments. The next steps are to develop 
computational control systems (such as 
more-sophisticated fluidic circuits) that allow 
a greater range of movement; to define new 
design rules for soft robots; and to adopt and 
improve manufacturing technologies. 

Other challenges remain. For example, the 
forces that soft robots can exert on the environ- 
ment might be limited, potentially restricting 
their applications. Moreover, the use of fluidic 
logic circuits as control systems, rather than 
conventional electronics, might limit the com- 
plexity of the behaviours that can be generated. 
A greater understanding of the properties of 
soft materials and how they interact with 
control systems and the environment is also 
needed, to produce desired robotic behaviour 
in real contexts.

Although soft robotics is still in its infancy, 
it holds great promise for several applications, 
such as servicing and inspecting machinery, 
search-and-rescue operations, and exploration.
Soft robots might also open up new
approaches to improving wellness and quality 
of life. Soft endoscopes that allow omni- 
directional bending, elongation and tunable 
stiffening are already a reality, as are soft
orthotic devices used for ankle and foot
rehabilitation. Wehner and colleagues’ find-
ings might help to guide research in these 
directions, contributing to the pillars of 
knowledge that will support the edifice of this 
new discipline.
Friendly neighbours
feed tumour cells 


In pancreatic cancer, neighbouring non-cancerous cells degrade their own 
proteins through a process called autophagy and release amino acids that are 
then taken up and used by the cancer cells. SEE LETTER P.479 


JURRE J. KAMPHORST & EYAL GOTTLIEB 


The cancer pancreatic ductal adenocarcinoma
(PDAC) has a poor prognosis,
and is an area of intense biomedical
research. Previous studies have shown that
PDAC has a growth-promoting effect on 
pancreatic stellate cells in the surrounding 
connective tissue, and these cells reciprocate 
by supporting tumour growth and spread. On 
page 479, Sousa et al. investigate the possibil- 
ity that pancreatic stellate cells directly provide 
tumour cells with nutrients, in studies using 
mice and human pancreatic cells. 

A key feature of PDAC is desmoplasia, in 
which the tumour becomes enmeshed and 
surrounded by dense, scar-like tissue. As
well as the tumour cells, desmoplasia involves 
other cell types, including pancreatic stellate 
cells, which are mainly responsible for the 
condition, and immune cells. Desmoplasia is 
a serious medical concern, because it forms a 
barrier to treatment with chemotherapeutic 
drugs as a result of poor blood perfusion into 
the tumour. Paradoxically, desmoplasia is promoted
by the release of signalling factors by
the tumour, even though it also impedes the 
tumour’s blood supply and hence its access to 
oxygen, glucose and other nutrients. 

How can tumour cells obtain sufficient 
food and energy in the resulting nutrient- 
poor conditions? Sousa et al. showed that 
when PDAC cells were grown in vitro either
with pancreatic stellate cells or in the
solution in which the stellate cells had
previously been grown, oxygen consumption by
the PDAC cells increased. This indicates that 
factors from the stellate cells can stimulate the 
activity of energy-producing organelles called 
mitochondria. Sousa and colleagues found that 
the increased mitochondrial activity is due to 
tumour consumption of the amino acid ala- 
nine, which is released at a high rate by the 
stellate cells. 

The authors used alanine labelled with 
heavy carbon isotopes to trace the fate of 
this imported amino acid in tumour cells. 
Surprisingly, they found that alanine had an 
unusual metabolic fate in PDAC cells beyond 
its normal role in protein synthesis. When 
taken up by the cells, alanine can be metabo- 
lized to pyruvate in the aqueous intracellular 
region known as the cytosol. However, the 
authors found that instead it went primarily 
into the mitochondrial pool of metabolite 
molecules (Fig. 1). 

Pyruvate is a key molecule in the mitochon- 
drial tricarboxylic acid cycle, which generates 
cellular energy and feeds into biosynthetic 
pathways such as lipid synthesis. It is usually 
generated by glucose metabolism in a process 
called glycolysis. Another potential source 
of pyruvate is alanine transamination, in 
which pyruvate is produced by the removal 
of nitrogen from alanine. Alanine transami- 
nation can occur in the cytosol or the 
mitochondria. 

Figure 1 | Pancreatic stellate cells feed a tumour. Pancreatic ductal adenocarcinoma is a form of
pancreatic cancer that is characterized by the build-up of dense, scar-like tissue pervading the tumour.
This limits the delivery of oxygen and nutrients (including glucose) to the tumour cells by blood vessels.
Sousa et al. find that, near tumour cells, pancreatic stellate cells degrade their own proteins in a process
called autophagy and release the resulting amino acids, most notably alanine, which are taken up and
consumed by the tumour cells, facilitating cancer growth. In the tumour cell, alanine is metabolized to
produce pyruvate in organelles called mitochondria, where pyruvate is used to produce energy and lipids
through the tricarboxylic acid (TCA) cycle. The use of this pyruvate generated from alanine allows the
limited glucose in the cell to be used for other metabolic purposes.

Sousa et al. found that the alanine
supplied to pancreatic cancer cells undergoes
transamination primarily in the mitochondria, 
producing pyruvate that enters the tricarboxy- 
lic acid cycle to provide the cells with energy 
and lipid molecules. Such selective channelling 
of alanine to these biosynthetic pathways frees
up glucose in the cell for use in other roles, 
such as production of the amino acid serine, 
which is required for nucleic-acid biosynthesis 
and hence cell growth and division. 

Although Sousa and colleagues’ work 
documents the metabolic support that 
alanine from pancreatic stellate cells can offer 
PDAC cells, it is unclear how and why alanine 
is channelled specifically to mitochondria. 
Alanine-derived pyruvate in the cytosol can 
also be transported to mitochondria for energy 
and biosynthesis purposes, and yet this takes 
place to a lesser degree. Direct transamination 
of alanine to pyruvate in the mitochondria 
seals its metabolic fate as a source of energy 
or biosynthetic molecules. One possible 
explanation for this compartmentalization of 
alanine transamination is the potentially higher 
availability of the nitrogen-acceptor molecule 
α-ketoglutarate in the mitochondria, which
might enable fast and efficient production 
of pyruvate in response to the large supply of 
alanine from stellate cells. 

Sousa et al. made a surprising finding — that 
a process called autophagy could affect a 
neighbouring cell. Autophagy is a survival 
mechanism used by cells to break down and 
metabolize their expendable proteins, lipids 
and other macromolecules when nutrients are 
scarce. It has been considered to be a process 
that acts only in the cell itself. However, Sousa
and colleagues demonstrated that inhibiting
autophagy in pancreatic stellate cells did not 
affect the cells’ growth, but rather abolished 
alanine release and the resulting support of 
PDAC-cell growth both in vitro and in vivo 
(when PDAC cells and pancreatic stellate cells 
were transplanted together into mice). 

The metabolic support of tumour cells by 
autophagy in pancreatic stellate cells provides 
a twist to the emerging tale of metabolic scavenging
by PDAC cells. These latter cells have
a high basal autophagy rate, helping them to
survive metabolic hardship. The previous
observation that PDAC-cell scavenging of
extracellular proteins such as albumin can
support PDAC metabolism and growth in
nutrient-limited conditions highlights the
tendency of these cancer cells to exploit diverse
nutrient sources. Extracellular material is
taken up by the cell through macropinocytosis,
a cell-membrane-based transport system.
These uptake and autophagy processes depend
on organelles called lysosomes, and lysosomal
activity was recently found to be increased
in PDAC.

Sousa and colleagues’ work provides 
metabolic insight into the well-documented 
crosstalk between cancer cells and their 
neighbours. It has been demonstrated that
pancreatic cancer cells send signals to adjacent 
stellate cells that then reciprocate and alter the 
intracellular signalling and metabolism of the 
cancer cells. It will be interesting to learn which 
tumour-generated signals stimulate autophagy 
and alanine secretion from the pancreatic 
stellate cells.

Signal locked in


A plant receptor protein interacts in an unusual way with the hormone it binds. 
The receptor cleaves the hormone, a fragment of which then binds covalently to 
the receptor and triggers a major receptor shape change. SEE LETTER P.469 


KIMBERLEY C. SNOWDEN & 
BART J. JANSSEN 


A receptor and its associated hormone are
often thought of as a lock and key,
in which the hormone key fits perfectly
into the receptor lock, leading to a biological
response — with the key ultimately being
released intact from the receptor. Yao et al.
on page 469 and an accompanying paper in
Nature Chemical Biology by de Saint Germain
et al. show that the reality is different for the
plant hormones known as strigolactones. The 
strigolactone receptor protein cleaves strigo- 
lactone and forms a covalent bond between 
the receptor and a cleaved hormone fragment 
called the D-ring. This triggers a dramatic 
change in the shape of the receptor, expos- 
ing surfaces that can interact with signalling 
partners. 


Strigolactones regulate aspects of plant 
development such as the growth of branches 
and the programmed death of leaves. They
are exuded by plants into the soil, where they 
stimulate interactions with symbiotic fungi 
that improve nutrient uptake. Strigolactones 
can also trigger the germination of parasitic 
plants, which can result in catastrophic crop 
failure, particularly in sub-Saharan Africa.

The evolutionarily conserved strigolactone 
receptor called AtD14 (also known as Arabidopsis
thaliana DWARF 14) is a member of
the α/β hydrolase superfamily of proteins,
characterized by a common core that coordinates
three catalytic amino acids (serine,
histidine and aspartate). This superfamily
includes the enzyme acetylcholinesterase,
one of the fastest known enzymes, which can
catalyse 25,000 reactions per second; but it
also includes an enzymatically inactive receptor
for the plant hormone gibberellin. The
strigolactone receptor is unusual in that it acts
both as a receptor and as an enzyme that can
cleave its own hormone. Even stranger, the rate
of cleavage is very slow, a positively torpid rate
of one reaction every 15-20 minutes.
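
To put the two quoted rates side by side, the short sketch below converts both to reactions per second. It uses only the figures given above (25,000 reactions per second for acetylcholinesterase, and one reaction every 15-20 minutes for the strigolactone receptor); the function names are illustrative and do not come from the studies discussed.

```python
# Back-of-the-envelope comparison of the two turnover rates quoted in the text.
ACHE_RATE_PER_S = 25_000.0  # acetylcholinesterase, reactions per second


def reactions_per_second(minutes_per_reaction: float) -> float:
    """Convert 'one reaction every N minutes' into reactions per second."""
    return 1.0 / (minutes_per_reaction * 60.0)


def fold_slower(minutes_per_reaction: float) -> float:
    """How many times slower the receptor is than acetylcholinesterase."""
    return ACHE_RATE_PER_S / reactions_per_second(minutes_per_reaction)


if __name__ == "__main__":
    for minutes in (15, 20):  # both ends of the quoted range
        print(f"one reaction per {minutes} min: "
              f"{fold_slower(minutes):.2e}-fold slower")
```

On these numbers the receptor is slower than acetylcholinesterase by roughly seven orders of magnitude, which is why the slow cleavage looks less like catalysis and more like part of the signalling mechanism.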

Yao and colleagues accomplished the difficult
task of solving the X-ray crystal structure
(at a near-atomic resolution of 3.3 angstroms)
for AtD14 in a complex with the proteins D3
and ASK1, two components of an SCF complex
involved in strigolactone signal propagation.
SCF complexes target proteins for
degradation by another protein complex called 
the proteasome. 

The structure revealed a fragment of the 
hormone covalently bound to two of the 
catalytic amino acids of AtD14. The authors’ 
observations imply that, after the hormone 
strigolactone binds to AtD14, it is cleaved
by a hydrolysis reaction catalysed by the
receptor, and the D-ring hormone fragment
becomes trapped in the catalytic active-site
pocket, leading to a conformational change in
AtD14 that promotes interaction with the SCF
complex (Fig. 1). 

The receptor in complex with the D-ring 
fragment and D3 undergoes a shape change 
when compared with structures of the receptor 
alone, with one of the four α-helix domains
that comprise the receptor’s ‘lid’ becoming
extended in length, and another α-helix unfolding
to form a loop. This change might explain
previously observed destabilization of the
receptor on hormone binding. In addition,
the catalytic active-site pocket is compressed in
volume from 420 Å³ to 80 Å³, and the entrance
to the pocket is closed. This observation indicates
that release of the D-ring is unlikely without
break-up of the complex, and could explain
why the enzyme activity is so slow. 

The greatest surprise from Yao and col- 
leagues’ crystal structure is the observation of 
a covalent bond between the D-ring and the 
receptor, because receptors don't usually bind 
their ligands in this manner. To confirm this
finding, the authors used mass spectrometry
to isolate a fragment of the receptor protein 
that showed a mass increase consistent with 
the D-ring bound to the histidine amino acid 
of the active site. This key observation is confirmed
independently by de Saint Germain
et al. in the accompanying paper, which investigated
the strigolactone receptor in pea plants
(Pisum sativum). This group used fluorescent 
strigolactone analogues to show that the initial 
hydrolysis of these compounds was relatively 
fast, but that turnover to release the D-ring 
seemed to be extremely slow, because release 
was not observed within the experimental time 
period (up to 30 minutes). 

Strigolactone promotes the degradation of 
the strigolactone receptor, but an unresolved
question is whether the strigolactone D-ring 
covalently bound to the receptor is released 
before the receptor is destroyed. De Saint Ger- 
main and colleagues’ observations suggest that 
there is no turnover of the D-ring to reset the 
receptor; however, in vitro experiments show
that, given sufficient time, the receptor can 
catalyse more than one reaction, which would 
be possible only if the D-ring is released. In 
addition, Yao et al. used mass spectrometry to 
detect release of the D-ring within an hour in 
in vitro assays. However, more studies, particu- 
larly in vivo experiments, are needed to clarify 
the kinetics of the reaction. 

Yao et al. describe a plant with a mutant 
receptor that has increased enzymatic activity 
but is unlikely to undergo the conformational 
shift. The mutant plant does not respond to 
strigolactone (at least as far as branching is
concerned). Meanwhile, de Saint Germain
et al. show that some substrates that do not
have biological activity can still be cleaved
by the receptor; however, these substrates do
not form the covalent linkage with the receptor
because of the absence of a methyl group
on the D-ring. Taken together, these findings
suggest that the enzyme activity of the receptor
is not sufficient for its biological function.
Instead, the key function of the strigolactone
receptor is the formation of the receptor-SCF
complex, which is probably initiated
by a conformational change in the receptor
structure.

Yao et al. also show that a mutation in the 
strigolactone receptor that affects interaction 
with the SCF-complex protein D3 does not 
affect binding of the receptor to a downstream 
signalling component, which suggests that dif- 
ferent receptor surfaces are involved in these 
interactions. X-ray crystal structures of the 
receptor bound to target proteins would clarify 
whether the receptor acts to present targets to 
D3 for subsequent degradation. 

Although the studies by Yao and colleagues 
and de Saint Germain and colleagues answer 
some questions about this unusual receptor, 
they also raise new questions. We do not yet 
know the sequence in which the strigolactone 
receptor, SCF, and the proteins targeted for 
degradation are assembled into a complex. 
Nor is it known whether receptor enzyme 
activity is fast or slow in vivo, and more work 
is needed to understand what the active form 
of the receptor-D-ring complex is during 
signalling. In particular, does it involve linkage
of the D-ring to just the histidine or to both the
histidine and serine catalytic amino acids? We
also do not know the extent of conformational
change in the receptor before D3 binds.

Figure 1 | An unusual hormone-receptor interaction. In the absence of the hormone strigolactone, the
AtD14 strigolactone receptor protein has a large, open cavity that contains a catalytic site. This binds and
hydrolyses strigolactone, then releases the ABC-ringed portion of the hormone, and it has been found
that the receptor traps the hormone D-ring as a covalently bound intermediate. Yao et al. identified the
trapped D-ring intermediate in X-ray crystal structures of AtD14 from the plant Arabidopsis thaliana,
and de Saint Germain et al. monitored the hormone interaction with the strigolactone receptor of pea
plants using fluorescent substrate analogues. AtD14 subsequently undergoes a conformational shift that
allows the protein D3 to bind the receptor. The proteins D3 and ASK1 form part of the SCF complex,
which targets proteins for degradation. The receptor-SCF complex adds a molecular tag (not shown) to
targeted signalling proteins, resulting in their degradation by the proteasome apparatus. If the D-ring
were released from the receptor, this would reset the receptor to its original state, but whether the
D-ring is released in vivo is not known.

A broader question is how does the 
structural diversity of the many known 
strigolactones fit into the varied roles of these 
compounds in signalling within and between 
plants and symbiotic fungi? Has this mode of 
receptor action evolved to cope with the vari- 
ation in strigolactones necessary in the arms 
race to evade parasitic plants? Only time will 
tell if this mode of receptor-ligand action is 
used by other receptors.

Fleeting glimpse of
an elusive element 


A heroic effort to characterize the chemistry of actinium, a short-lived 
radioactive element, reveals surprising differences in behaviour compared 
with other elements in the actinide series. 


THOMAS E. ALBRECHT-SCHMITT 


It is now possible to study the chemistry of
radioactive elements that have short half-lives
using state-of-the-art techniques that
were not available when the elements were
first discovered. Writing in Nature Communications,
Ferrier et al. report just such an
investigation of the poorly understood heavy
element actinium (Ac) — the first element in
the actinide series of the periodic table. They
reveal that the chemical behaviour of actinium
in water differs from that of the heavier
actinides, and in so doing provide a clue that
might enable actinium to be used in cancer
radiotherapy.

Like all elements beyond bismuth in the
periodic table, actinium has no stable isotopes.
Moreover, the only available isotopes, ²²⁵Ac
and ²²⁷Ac, must be extracted from the decay
of other radioactive elements. The parents
of these actinium isotopes include several
isotopes of radium — which is not readily
available and also decays, to form extremely
dangerous radon gas — as well as rare isotopes
of thorium, uranium and protactinium.
As a result, the supply of actinium is typically
limited to only a few micrograms at a time.

The challenges continue to mount in the
case of ²²⁵Ac, because it has a half-life of just
10 days. To make matters worse, the chemistry
of actinium cannot be interrogated using
standard spectroscopic techniques, because
its only oxidation state is +3, and the associated
ion, Ac³⁺, has a ‘closed-shell’ electron
configuration — rendering it invisible to
those techniques. The combination of intense
radioactivity and spectroscopic invisibility
makes efforts to tackle actinium chemistry
nothing short of heroic.
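
As a rough illustration of what a 10-day half-life means for a microgram-scale sample, the sketch below applies the standard exponential-decay law, N(t)/N0 = (1/2)^(t/T). The half-life is the rounded value quoted above; the time points and function name are arbitrary choices made for illustration.

```python
# Fraction of a radioactive sample remaining after a given time,
# using N(t)/N0 = 0.5 ** (t / half_life).
HALF_LIFE_DAYS = 10.0  # rounded half-life quoted in the text


def fraction_remaining(days: float, half_life: float = HALF_LIFE_DAYS) -> float:
    """Return the fraction of the original nuclei that have not yet decayed."""
    return 0.5 ** (days / half_life)


if __name__ == "__main__":
    for days in (1, 10, 30):  # illustrative time points only
        print(f"after {days:>2} days: {fraction_remaining(days):.1%} remains")
```

After three half-lives only one-eighth of the starting material is left, so an experiment that slips by a month has lost most of its sample before it begins.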

Understanding the fundamental chemistry 
and physical properties of elusive elements is a 
worthy enterprise in itself, but there is another 
reason for the interest in actinium: ²²⁵Ac has
been recognized as a promising candidate for
treating cancer through a process called targeted
alpha therapy. This therapy relies on the
fact that ²²⁵Ac decays by emitting α-particles
— which consist of two protons and two neutrons,
and are one of
the most common
ing radioactive decay
(β-particles and γ-rays), but they can usually
travel through only the first few layers of cells
in tissue. This makes them ideal for cancer
therapy, because it means they can destroy 
cancer cells without damaging the surround- 
ing tissue. 

However, to develop radioisotopes for this 
kind of treatment, molecules that trap the iso- 
tope’s ion and deliver it to specific areas of the 
body are needed. Such molecules are called 
chelating agents, and must have remarkably 
high binding specificity for the radioisotope
being delivered — otherwise, the isotope
could be displaced from the chelating agent 
by other, biologically available metals, such as 
iron, when the isotope enters the bloodstream. 
Herein lies the rub: because the chemistry of 
actinium is sufficiently different from that of 
other elements with +3 oxidation states (tri- 
valent elements), we do not have appropriate 
chelating agents for actinium. Moreover, we 
cannot design them without knowing how 
molecules and ions bind to Ac³⁺ ions in water.
The goal of Ferrier and colleagues’ work was 
to uncover why actinium’s chemistry devi- 
ates from that of other trivalent elements in 
the lanthanides and actinides, the two series 
of elements with which actinium is grouped 
in the periodic table. 

Because many actinides are scarce and radio- 
active, molecules that probe their reactivity 
must be developed using analogous, non- 
radioactive elements. This often takes years 
of effort. Fortunately, trivalent actinides can 
have chemistry similar to that of trivalent 
lanthanides, of which only promethium is 
radioactive. More specifically, a lanthanide 
ion can typically be found that has a similar 
ionic radius to that of a target actinide, and 
can therefore act as the actinide’s substitute 
during efforts to explore chemical reactivity, 
preventing the depletion of precious radio- 
nuclide resources. Even when the origin of the 
differences between actinides and lanthanides 
is being studied, lanthanides provide a good 
first approximation of the chemical behaviour 
of actinides. 

However, finding a lanthanide ion of similar 
size to Ac³⁺ presents a challenge, because the
ionic radius of actinium, although not established
with a high degree of precision, must be
much larger than that of lanthanum (La, the
largest lanthanide). In fact, the chemistry of
La³⁺ might well lead researchers astray, because
it is known to be unusual — lanthanum is the
only element to form ions that commonly
interact directly with ten molecules (to use
the jargon, it often has ten ligand molecules in 
its inner sphere). 

Figure 1 | Actinide contraction. The radii of the tripositive ions of the first elements of the periodic table’s actinide series become progressively smaller.
(Ionic radii are given in angstroms, and ions are shown to scale; thorium, the second element in the series, is not shown because it does not form a stable
tripositive ion in water.) The size of each ion correlates with the ion’s ability to bind to certain kinds of ligand molecule. Ferrier et al. report that the
ligand-binding behaviour of actinium differs substantially from that of americium.

So how did Ferrier et al. overcome the
practical issues of working with actinium?
They took advantage of recent upgrades at the
Stanford Synchrotron Radiation Lightsource,
a facility in California capable of producing 
extremely bright X-rays, to study microgram 
samples of actinium using a technique called 
X-ray absorption fine-structure spectroscopy,
to which actinium ions are not invisible.
More specifically, the authors studied solutions
of an Ac³⁺ salt dissolved in hydrochloric
acid to observe how water molecules and chloride
ions bind to the Ac³⁺ ions. They also used
computational modelling to help interpret the 
results. 

The authors find that Ac³⁺ binds to nine
ligands in aqueous solutions, as is common
for early actinides (actinium to americium in
the actinide series). This was perhaps unexpected:
given the large ionic radius of Ac³⁺,
one might have anticipated that its inner
sphere of ligands would have been larger. But
the real surprise was the finding that the inner
sphere incorporates more chloride ions than
was previously thought: the dominant species
observed was Ac(H₂O)₆Cl₃. For comparison,
Ferrier et al. studied solutions of americium
(Am), because americium is the first trivalent
actinide after actinium not to readily undergo
changes of oxidation state. The primary species
they observed in hydrochloric acid was
[Am(H₂O)₈Cl]²⁺.

One of the few predictable trends known 
for the actinides is that the ionic radius of 
the tripositive ions contracts as one traverses 
the series from actinium to lawrencium. The 
contraction is quite consistent, averaging 
about 0.01 angstroms between neighbouring
actinides (Fig. 1). Although the contraction
is gradual, the chemistry of the ions often
changes considerably in tandem with each
contraction, so that adjacent actinide compounds
exhibit quite different atomic structures,
reactivities and electronic properties.
Protactinium, plutonium and californium
are all good examples of actinides whose ion 
chemistry differs substantially from that of 
their neighbours in the periodic table. 

The contracting ionic radius also means 
that the later actinides in the series have higher 
charge densities than earlier ones. Actinide 
ions in general are said to be ‘hard’ cations, 
which means that they bind preferentially to 
small ligands such as water and fluoride ions. 
But Ac³⁺ should be the ‘softest’ of the trivalent
actinides, binding preferentially to softer,
larger ligands. Ferrier and colleagues’ observa- 
tion that actinium binds to a larger number of 
chloride ions in water than americium is con- 
sistent with this theory, because chloride ions 
are larger than water molecules. This knowl- 
edge can be used to design chelating agents for 
actinium-based radiopharmaceuticals: such 
agents should contain more soft atoms than 
are typically used in these compounds.

Memories linked within
a window of time 


In mice, two fear-associated memories that are created close in time are represented 
in the brain’s amygdala by the activation of overlapping ensembles of neurons. As a 
result, eliminating the fear of one memory also extinguishes fear of the other. 


HOWARD EICHENBAUM 


Two of my strongest memories are of
emotional but unrelated events that
occurred on the same day. On 20 July
1969, I successfully finished a gruelling month- 
long, round-the-clock experiment. That even- 
ing, I saw men first walk on the Moon. These 
two events are, for me, forever connected. 
Why? Writing in Science, Rashid et al. offer
an explanation — emotionally charged events 
that occur close in time are bound together 
because of an overlap between the ensembles 
of neurons that are excited when the memories 
are laid down. 

In their study, Rashid and colleagues trained 
mice to fear a particular tone (tone 1) by pair- 
ing the sound with a mild electric shock. 
Then, after an interval of between 1.5 and 
24 hours, they trained the same animals to 
fear a different tone (tone 2). If the authors 
subsequently extinguished the animals’ fear of
tone 2, by repeatedly using it without shock,
fear responses to tone 1 also decreased — but 
only if the initial training events had occurred 
within 6 hours of each other. Thus, a selective 
link forms between memories of fear-associ- 
ated events that occur close together. 

A brain region called the amygdala is essen- 
tial for associating cues with shock. Select 
ensembles of neurons in this region are acti- 
vated in mice during fear-associated train- 
ing, and are then reactivated when the animal 
recalls the event. Using a sophisticated method 
for marking neurons activated during each 
event, Rashid et al. found a high degree of 
overlap (co-allocation) between the neurons 
activated by the two memories if the memories 
were created within 6 hours of each other, but 
not when created 24 hours apart (Fig. 1). 

Figure 1 | Co-allocating cell populations to memories by time. Rashid et al. trained mice to fear a
tone (tone 1) that was coupled to a mild electric shock. They then repeated the training for a different
tone (tone 2), either 6 or 24 hours later. Overlapping populations of neurons in the brain’s amygdala were
excited during the training sessions 6 hours apart, but not those 24 hours apart. As a result, the memories
of the training sessions that occurred close together became linked.

50 Years Ago

The Meteorological Office is cheerful if restrained about the
success of its long range weather prediction service in its annual
report for 1965 ... Long range forecasts for a period of thirty
days ahead have been published twice a month for the past three
years ... the Meteorological Office says that results have been slightly
better than expected ... forecasts are assessed after the event ... and
“marks are given for the accuracy of forecasts of temperature, rainfall
and additional information”. Predicting temperature seems to
be the easiest, with twenty-eight out of fifty forecasts ranking for the
mark “good agreement”. On rainfall, however, agreement between
forecasts and reality was good on fourteen occasions, moderate
on seventeen and deserving the description “little agreement”
on nineteen occasions ... The long range forecasts are based on
searches of records going back to the middle of the nineteenth
century for analogous patterns of mean temperature and
mean pressure in the northern hemisphere ... What might
be called objective long range forecasting is reckoned to be an
extremely distant prospect.

From Nature 20 August 1966

100 Years Ago

The firing of very heavy guns at a great distance was clearly audible
at Harpenden throughout the days of August 7 and 8, as well as on
previous occasions. The direction of the sound is evidently from
the south-east, and that of each explosion lasts about two seconds.
Our elevation is 440ft., and the local wind has been from west to
north-west. The distance between Harpenden and Bapaume would be
about 185 miles.

From Nature 17 August 1916

Next, the authors used elegant genetic tools
to force activation of a common set of amygdala
neurons during both learning events
— a trick that artificially linked memories
that occurred outside the 6-hour window. By
contrast, when they inhibited excitation of the 
same population, preventing co-allocation of 
neuronal ensembles, memories that occurred 
within the 6-hour window were separated. 
These experiments confirm that common 
excitability of neurons in the amygdala is 
responsible for co-allocating neurons to mem- 
ories within a window of time. 

A related paper by Cai et al. was recently
published in Nature, and reported a similar 
finding — spatial memories that are acquired 
near in time are associated with overlapping 
neuronal ensembles in the brain’s hippocam- 
pus. In this study, the authors monitored 
neuronal activity in mice exposed to one envi- 
ronmental context (context A), then a week 
later to two other contexts (contexts B and C) 
separated by five hours. In line with Rashid 
and colleagues’ findings, Cai et al. observed 
co-allocation of activated hippocampal neu- 
rons as animals explored contexts five hours 
apart, but not seven days apart. They then 
demonstrated the behavioural significance of 
this co-allocation by showing that subsequent 
pairing of shock with context C resulted in 
fear of that context and of context B, but not 
of context A. 

Importantly, the memories of contexts B and 
C were distinct — subsequently extinguishing 
the fear of context C did not extinguish the fear 
of context B. This is in contrast to Rashid and 
colleagues’ findings, and suggests a distinction 
in how timing contributes to the integration of 
memories in different systems. The results in 
the amygdala reflect the region’s role in gener- 
alizing defensive reactions to aversive stimuli. 
By contrast, in the hippocampus, the linkage 
of distinct memories promotes recall of one 
memory when cued by another, reflecting the 
role of this region in supporting the elaborate 
organizational structure and flexibility of our 
conscious memories. 

The role of time in binding memories has
been recognized since Aristotle's day. Our current
understanding of this process is embodied
in a theory called temporal context memory
(TCM), which was originally developed to 
explain why people have a better memory for 
words that occur close together in a list than for 
those farther apart. The theory was extended 
in 2013 to explain why people better remember 
events that occur within a few thousand sec- 
onds of each other*. TCM theory posits that 
the transient overlapping of neural-activity 
patterns creates a gradually changing 
activity landscape that can link memories that 
are created close together in time and can sepa- 
rate memories that are created far apart. The 
present results confirm TCM theory. 
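
The drifting-context idea can be illustrated with a toy simulation. The sketch below is not the published TCM model; it only assumes that a context vector keeps a fraction `rho` of its previous state at each time step and mixes in fresh noise. The names `drift_context`, `overlap`, `rho` and `dim` are hypothetical and chosen for illustration.

```python
import math
import random


def drift_context(n_steps, dim=500, rho=0.9, seed=0):
    """Return a list of unit-length context vectors that drift slowly over time."""
    rng = random.Random(seed)
    c = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    states = []
    for _ in range(n_steps):
        norm = math.sqrt(sum(x * x for x in c))
        c = [x / norm for x in c]
        states.append(c)
        # Keep a fraction rho of the old context and mix in fresh noise
        # (noise is scaled so that its expected length is about 1).
        noise = [rng.gauss(0.0, 1.0) / math.sqrt(dim) for _ in range(dim)]
        c = [rho * x + math.sqrt(1.0 - rho * rho) * e for x, e in zip(c, noise)]
    return states


def overlap(a, b):
    """Cosine similarity between two unit vectors."""
    return sum(x * y for x, y in zip(a, b))


states = drift_context(n_steps=200)
near = overlap(states[100], states[101])  # events one step apart
far = overlap(states[100], states[150])   # events fifty steps apart
```

In this toy setting, events encoded one step apart share most of their context, whereas events fifty steps apart share almost none, which is the qualitative behaviour the theory relies on to link nearby memories and separate distant ones.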

Consistent with TCM, a neurophysiological 
study in rats has shown that firing patterns of
neural ensembles in the hippocampus gradu- 
ally change over time. This change enables the 
animals to remember the order in which events 
occurred during an experience. Moreover, the 
gradual evolution of the neural networks that 
represent experiences continues for days.
Brain-imaging studies in humans have also 
documented a gradual evolution of hippocam- 
pal neuronal activity associated with successful 
recall of the temporal order of events and with 
distinguishing events that occur at different 
times. Thus, in addition to the role of time in
linking and separating memories described in 
the current studies, time can also sequentially 
organize memories. 

Of course, time is only one variable that 
can link and separate memory representa- 
tions. Previous studies have demonstrated 
that other types of contextual manipulation, 
such as shared or distinct stimuli and consistent
or opposing reward associations for the
same stimulus, can link or separate neural
representations of memories that occur close 
in time. Nonetheless, Rashid et al. and Cai et al. 
extend our understanding of the role of time 
in integrating and separating emotional and 
spatial memories, and reveal that allocation of 
neural ensembles underlies the powerful role 
of evolving temporal contexts in linking and 
separating memories.
Mitotic regulation
comes into focus 


Structural studies provide insight into the mechanisms governing a checkpoint 
in cell division that prevents chromosomes from segregating before they are 
properly aligned on a structure called the mitotic spindle. SEE ARTICLE P.431 


DAVID O. MORGAN


Late in cell division, duplicated chromosomes
are pulled apart by a structure
called the mitotic spindle. This process,
called mitosis, is beautiful to watch,
but fraught with danger: errors can produce
daughter cells that have unequal genomes, setting
the stage for cell death or cancer. The cell
goes to great lengths to avoid these mistakes,
and uppermost among its safety mechanisms
is the spindle assembly checkpoint (SAC),
a regulatory system that prevents the cell
from attempting to separate chromosomes
if they are not properly attached to the spindle.
The key protein components of the SAC
were first reported in the summer of 1991
(refs 2,3). Precisely 25 years on, two papers
(Alfieri et al. on page 431 and Yamaguchi et al.
in Molecular Cell) describe in near-atomic
detail how the SAC prevents chromosome
separation.

The central protagonist in this story is the
anaphase-promoting complex/cyclosome
(APC/C). This enzyme attaches the protein
ubiquitin to specific substrate proteins, tagging
them for destruction and thereby unleashing
chromosome separation and the completion
of mitosis. A key APC/C target, for instance,
is the protein securin, which inhibits a protease
enzyme that cuts the proteins holding
duplicated chromosomes together. Substrate
binding to the APC/C depends on a protein
called Cdc20 (Fig. 1a), which binds the APC/C
during mitosis and recruits substrates through
interactions with short linear amino-acid
sequences in the targets called degrons.

How does the SAC block chromosome
separation? Chromosomes that are not correctly
attached to the spindle produce a mitotic
checkpoint complex (MCC), which contains
three SAC components (Mad2, BubR1 and
Bub3) and Cdc20. The MCC binds and inhibits
Cdc20-bound APC/C (APC/C^Cdc20), forming a
large complex called APC/C^MCC that contains
two copies of Cdc20 (refs 1,6,7). Alfieri et al.
and Yamaguchi et al. use cryo-electron microscopy
to unveil high-resolution structures of
this giant assembly.

A structure of this size and complexity,
backed up by 25 years of genetic, biochemical
and structural analysis, contains an overwhelming
amount of information for the
aficionado, but its chief value lies in the precise
description of the mechanisms by which the
MCC inhibits APC/C^Cdc20. Both studies show
that the MCC interacts with the front face
of the APC/C, next to the pre-bound Cdc20
subunit (Fig. 1b). The BubR1 subunit acts
as a pseudosubstrate inhibitor: it contains
two copies of each of the three major degron
sequences, and wraps around the two Cdc20
subunits to occupy all degron-binding sites
on both, thereby blocking substrate binding
to the APC/C. It is hard to find a more striking
illustration of the power of short linear
motifs such as degrons in cell regulation. In
addition, the MCC prevents binding between
the APC/C and the E2 coenzyme, which
normally provides ubiquitin for transfer to
target proteins.

The new structures also offer a potential
explanation for previous evidence that some
proteins are ubiquitinated by APC/C^Cdc20
even when the complex is bound to the MCC.
For example, the APC/C substrate proteins
cyclin A and Nek2A are degraded early in
mitosis, when the SAC is active. Similarly,
the Cdc20 subunit of the MCC is tagged with
ubiquitin by the APC/C, promoting MCC
turnover. How does APC/C^MCC modify these
proteins in spite of the mechanisms suppressing
its activity?

In addition to the ‘closed’ conformation 
described above, the two studies reveal a 
less-common ‘open’ state in which the MCC 
is shifted to allow binding of E2. This tran- 
sient state enables ubiquitination of Cdc20 




Figure 1 | Deciphering inhibition of cell division. a, The enzyme anaphase-promoting
complex/cyclosome (APC/C) depends on an activator subunit called Cdc20 for activity and substrate
binding. Cdc20 contains binding sites for three short amino-acid sequences called degrons. APC/C
targets (not shown) that contain these degrons bind these sites and are tagged with the protein ubiquitin
(Ub), which is transferred from an E2 coenzyme bound nearby. b, Alfieri et al. and Yamaguchi et al.
report that the mitotic checkpoint complex, which comprises the proteins BubR1, Bub3, Mad2 and
another Cdc20, interacts with Cdc20-bound APC/C to form a closed inhibitory state in which E2 binding
is blocked (Bub3 is not shown, because its position could not be defined in the groups’ structures).
Disordered regions in the BubR1 subunit contain degrons that occupy all six degron-binding sites on the
two Cdc20 molecules, preventing substrate binding.
by APC/C^MCC, and might also explain the
ubiquitination of cyclin A and Nek2A, which
bind APC/C^Cdc20 not only at sites blocked by
BubR1, but also at other sites (ref. 6). With
the open structure in hand, we are in a good
position to unravel these mechanisms. It will
also be important to explore the dynamics
of the shift between conformational states,
and how other regulatory proteins influence
these dynamics.

These APC/C^MCC structures come close
on the heels of a structural study by Zhang
et al. that addressed another major question
in APC/C regulation. It is known that the
mitotic protein-kinase enzyme Cdk1-cyclin B
phosphorylates the APC/C to promote its
activation by Cdc20. APC/C^Cdc20 then triggers
cyclin B degradation to inactivate Cdk1. This
negative feedback is thought to be the basis
for the oscillator that drives the rise and fall of
Cdk1 activity during the cell cycle, but much
about this scheme has been unclear. How does
phosphorylation activate APC/C^Cdc20? How is
APC/C activation delayed to allow Cdk1 activity
to rise in early mitosis, even in cell types
in which the SAC is not present?

Zhang and colleagues’ analysis, together
with recent biochemical studies, yields
valuable clues. Phosphorylation of a loop
region in the Apc3 subunit of the APC/C
provides docking sites for a phosphate-binding
subunit of Cdk1-cyclin B. Docking of
Cdk1-cyclin B enhances its activity towards
suboptimal phosphorylation sites in a loop on
a different APC/C subunit, Apc1. This disordered
loop contains a short ‘autoinhibitory’
segment that occupies a Cdc20-binding site
on the APC/C, but phosphorylation displaces
the segment, allowing Cdc20 to bind and activate
the APC/C. The slow, multistep nature of
this process provides a plausible mechanism
for introducing a delay between Cdk1 activation
and APC/C^Cdc20 activation, as is required
for negative feedback to produce a robust
oscillator.

We have not heard the last word on APC/C 
phosphoregulation. The APC/C contains 
many phosphorylation sites in addition to 
those described above, leaving open the
possibility of other regulatory mechanisms, 
or connections between phosphorylation and 
the SAC. 

Another issue also remains unresolved. 
After mitosis, Cdc20 is degraded and the 
APC/C interacts with the Cdc20-related 
protein Cdh1. Cdh1 interacts with the site 
occupied by the autoinhibitory segment of 
Apc1, and yet phosphorylation of this segment
is not required for Cdh1 binding. How
does Cdh1 bind this region? Cdh1 might be 
a better competitor than Cdc20 for binding at 
this site, or there might be other mechanisms
at play. 

Finally, the Apc1 loop and autoinhibitory
segment are poorly evolutionarily con- 
served outside vertebrates, raising questions 


about APC/C regulation in other species. 
Fortunately, addressing these and related 
problems has just become much easier, now 
that we have a strong structural foundation 
on which to base future experiments. We 
are a major step closer to understanding the 
remarkable robustness of mitosis.
Earth-like planet
around Sun’s neighbour 


An Earth-mass planet has been discovered in orbit around Proxima Centauri,
the closest star to our Sun. The planet orbits at a distance from the star such that 
liquid water and potentially life could exist on its surface. SEE LETTER P.437 


ARTIE P. HATZES 


Exoplanet discoveries trigger our
imaginations. What does the newly
discovered planet look like and what
are its characteristics? Could it harbour life 
and, if so, how many such habitable worlds 
are there in our Galaxy? Astronomers aim to 
find Earth-like planets in the temperate zone 
of a star — the distance from the star at which 
the planet’s surface temperature could theo- 
retically support liquid water. On page 437, 
Anglada-Escudé et al. report the exciting discovery
of an Earth-mass planet in the temperate
zone of Proxima Centauri, the closest star
to our Sun. 

Proxima Centauri is a low-mass star (it has 
12% of the Sun’s mass) that belongs to the
family of stars known as M dwarfs. Astrono- 
mers are focusing their efforts on finding 
small, potentially habitable planets around 
M dwarfs because such planets can be dis- 
covered with the instruments available 
today. An Earth-like planet orbiting in the 
temperate zone can be detected through a 
‘Doppler wobble’ — the effect caused by the 
planet’s gravitational tug on the motion of its 
host star. This method was used to discover 
the first exoplanet around a Sun-like star
in 1995. 

Applying the Doppler method to an 
M dwarf has two benefits. First, such stars are 
cooler than the Sun, which means that their 
temperate-zone planets would orbit much 
closer to the star (about one-tenth of the 
Sun–Earth distance). Second, these stars are
less massive than the Sun. Both of these factors
result in a Doppler wobble for M dwarfs that
is large enough (about 1 metre per second) to
be detected by modern instruments. By comparison,
Earth induces a Doppler wobble in the
Sun of 0.09 m s⁻¹.
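
The size of the Doppler wobble can be checked with the standard two-body expression for the star's reflex velocity. The sketch below assumes a circular orbit viewed edge-on; the function name and the rounded physical constants are illustrative choices rather than anything taken from the discovery paper, and the planet mass (1.3 Earth masses) and period (11.2 days) are the values quoted elsewhere in this article.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
M_SUN = 1.989e30   # solar mass, kg
M_EARTH = 5.972e24 # Earth mass, kg
DAY = 86400.0      # seconds per day


def doppler_semi_amplitude(planet_mass_kg, star_mass_kg, period_s):
    """Stellar reflex velocity (m/s) for a circular, edge-on orbit."""
    return ((2.0 * math.pi * G / period_s) ** (1.0 / 3.0)
            * planet_mass_kg
            / (star_mass_kg + planet_mass_kg) ** (2.0 / 3.0))


# Earth around the Sun (one-year period).
k_earth = doppler_semi_amplitude(M_EARTH, M_SUN, 365.25 * DAY)

# A 1.3-Earth-mass planet on an 11.2-day orbit around a 0.12-solar-mass star.
k_proxima_b = doppler_semi_amplitude(1.3 * M_EARTH, 0.12 * M_SUN, 11.2 * DAY)
```

Under these assumptions, the Earth–Sun case comes out at roughly 0.09 m s⁻¹ and the M-dwarf case at a metre per second or so. Both the short period and the low stellar mass push the second value up, which is the point made in the text.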

M dwarfs are the most abundant stars in 
the Galaxy, but so far, only a few Earth-mass 
planets have been discovered in the temperate
zones of such stars. If even a small fraction
of M dwarfs have temperate-zone planets, our
Galaxy could be teeming with life.

Anglada-Escudé and colleagues first found 
their exoplanet, which is called Proxima 
Centauri b, using Doppler measurements 
taken by the Ultraviolet and Visual Echelle 
Spectrograph at the European Southern 
Observatory (ESO) in Chile between 2000 and 
2008. These data showed a hint — but not an 
entirely convincing one — of a Doppler wobble 
of 1.38 m s⁻¹. The authors confirmed the signal
by using many more Doppler measurements 
taken in 2016 with the ESO’s High Accuracy 
Radial velocity Planet Searcher. 

The authors’ careful analysis of the data 
eliminated other possible causes for the Dop- 
pler wobble, such as stellar activity. The signal 
is particularly convincing because it is seen 
in two independent data sets that span more 
than a decade. In addition to the fact that the 
planet is orbiting Proxima Centauri, there 
are two further exciting aspects of Anglada- 
Escudé and colleagues’ discovery: the orbital 
period of 11.2 days places the planet within 
Figure 1 | Artist’s impression of the exoplanet Proxima Centauri b. Anglada-Escudé et al. have discovered an Earth-like planet in orbit around the closest
star to our Sun, Proxima Centauri. The planet's surface temperature should allow it to support liquid water, and its mass suggests that it might have a rocky 
surface. In this artist’s impression, the planet is assumed to have an Earth-like atmosphere. 


the temperate zone of Proxima Centauri, and 
the planet is only 1.3 times more massive than 
Earth, so it might be Earth-like (Fig. 1). These 
observations naturally raise the question of 
whether Proxima Centauri b could harbour 
life. However, circling a star at the right dis- 
tance is no guarantee that the planet has liq- 
uid water, or even an atmosphere that can 
support life. 
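
The link between the 11.2-day period and the planet's distance from the star follows from Kepler's third law. A minimal sketch, assuming a circular orbit and a negligible planet mass (the function name is hypothetical):

```python
def orbital_distance_au(period_days, star_mass_solar):
    """Kepler's third law in solar units: a^3 = M * P^2 (a in au, P in years)."""
    period_years = period_days / 365.25
    return (star_mass_solar * period_years ** 2) ** (1.0 / 3.0)


# Proxima Centauri b: 11.2-day period around a 0.12-solar-mass star.
a_proxima_b = orbital_distance_au(11.2, 0.12)
```

With these inputs the orbit comes out at roughly 0.05 of the Sun–Earth distance, the same order of magnitude as the temperate-zone distance quoted for M dwarfs above.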

One problem for Proxima Centauri, and 
for M dwarfs in general, is that these stars 
are much more active than the Sun. Proxima 
Centauri produces powerful flares, and the
X-ray flux received by the planet is 400 times 
greater than the flux that Earth receives from 
the Sun. Energetic particles associated with 
the flares may erode the atmosphere or hinder 
the development of primitive forms of life. We 
also don’t know whether the exoplanet has a 
magnetic field, like Earth, which could shield 
it from the dangerous stellar radiation. Until 
we understand what makes a planet habitable, 
it is better to say that Proxima Centauri b lies 
in a temperate zone (the right temperature) 
rather than a habitable zone (the right condi- 
tions to support life). Interestingly, M-dwarf 
stars are long-lived, and Proxima Centauri 
will exist for several hundreds or thousands
of times longer than the Sun. Any life on the
planet could still be evolving long after our Sun 
has died. 

Studies of the exoplanet’s atmosphere 
could assess its habitability. One way to do 
this is by using a technique called transmis- 
sion spectroscopy’: if the planet passes in 
front of (transits) the star when viewed from 
Earth, its atmosphere would absorb the star- 
light while transiting, and so the spectrum of 
the starlight would contain a signature of the 
planet’s atmosphere. But we currently do not 
know whether Proxima Centauri b is such a 
transiting planet — there is only a 1.5% chance 
that it is.
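
The low transit chance is a matter of geometry: for a randomly oriented circular orbit, the probability is approximately the stellar radius divided by the orbital distance. The sketch below uses a stellar radius of 0.14 solar radii and an orbital distance of 0.0485 au. Neither value appears in this article; both are assumptions made for illustration.

```python
R_SUN_AU = 0.00465047  # solar radius expressed in astronomical units


def transit_probability(star_radius_solar, orbital_distance_au):
    """Geometric transit chance for a randomly oriented circular orbit: R_star / a."""
    return star_radius_solar * R_SUN_AU / orbital_distance_au


# Assumed values: 0.14 solar radii for the star, 0.0485 au for the orbit.
p_transit = transit_probability(star_radius_solar=0.14, orbital_distance_au=0.0485)
```

Under these assumed inputs the estimate is a little over 1%, consistent in order of magnitude with the figure quoted in the text.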

One could also attempt to detect the 
reflected or radiated light from the planet 
directly, although this can be done only for nearby
planets. Because Proxima Centauri is relatively 
close to us, such attempts have a reasonable 
chance of succeeding. In the distant future, an 
interstellar space probe might get a close-up 
look at the planet. 

NASA's Transiting Exoplanet Survey Satellite,
which is scheduled for launch in 2017, will 
search for transiting planets around thousands 
of the closest M dwarfs. The planets’ atmospheres
could then be characterized using the
James Webb Space Telescope, scheduled for
launch in 2018. In the next decade, we will 
learn much more about the atmospheres of 
exoplanets in the temperate zones of M-dwarf 
stars. Meanwhile, Proxima Centauri b gives 
astronomers their best opportunity yet to study 
such planets.
Early onset of industrial-era warming
across the oceans and continents 


Nerilie J. Abram, Helen V. McGregor, Jessica E. Tierney, Michael N. Evans, Nicholas P. McKay, Darrell S. Kaufman &
the PAGES 2k Consortium


The evolution of industrial-era warming across the continents and oceans provides a context for future climate change 
and is important for determining climate sensitivity and the processes that control regional warming. Here we use post- 
AD 1500 palaeoclimate records to show that sustained industrial-era warming of the tropical oceans first developed during 
the mid-nineteenth century and was nearly synchronous with Northern Hemisphere continental warming. The early 
onset of sustained, significant warming in palaeoclimate records and model simulations suggests that greenhouse forcing 
of industrial-era warming commenced as early as the mid-nineteenth century and included an enhanced equatorial 
ocean response mechanism. The development of Southern Hemisphere warming is delayed in reconstructions, but this 
apparent delay is not reproduced in climate simulations. Our findings imply that instrumental records are too short 
to comprehensively assess anthropogenic climate change and that, in some regions, about 180 years of industrial-era 
warming has already caused surface temperatures to emerge above pre-industrial values, even when taking natural
variability into account.


Palaeoclimate data from the past two millennia—a period for which 
natural and anthropogenic climate forcings are reasonably well 
constrained—provide perspectives on global temperature changes during 
the twentieth century. Climate reconstructions of the past 2,000 years 
have focused mainly on the Northern Hemisphere, using records
derived primarily from terrestrial settings. Recent continental-scale 
temperature reconstructions provide evidence for twentieth century 
warming over all reconstruction regions except Antarctica. A new
Southern Hemisphere temperature reconstruction also demonstrates 
that the twentieth century is the only period of the last millennium 
during which warm extremes occurred simultaneously across both 
hemispheres. However, these hemispheric and regional temperature
histories do not allow for assessments of how past temperature changes 
evolved between the oceans and land. 

The oceans represent a major heat reservoir, taking up more than
90% of the total global energy imbalance since the 1950s. Internal
variability of ocean circulation mediates the global climate and, for
example, is implicated in the slowdown of global atmospheric warming
during the ‘hiatus’ interval (AD 2001–2014) because of the drawdown
of additional heat into the subsurface ocean. The existence of the
recent warming slowdown is debated; however, earlier decade-scale
plateaus in the rate of warming are prominent features of the climate
record. Given the importance of the oceans in determining the pace
and regional structure of changes in climate, it is essential to understand
how anthropogenic warming developed in the oceans and over
land during the industrial era.

Determining an unambiguous time for the start of the industrial era
is difficult, and forms part of the debate over a formal definition of the
Anthropocene. The Intergovernmental Panel on Climate Change
(IPCC) uses ‘industrial era’ to refer, somewhat arbitrarily, to the time
after AD 1750, when industrial growth began in Britain, spread to other
countries and led to a strong increase in fossil fuel use and greenhouse
gas emissions. Here, we use the term ‘industrial-era warming’ to refer
to the sustained, significant (P < 0.1) warming of Earth’s climate that
developed during the industrial era. We use the palaeoclimate history
since AD 1500 as a context for assessing the evolution of industrial-era
warming across surface-ocean and land areas. Our assessments use
newly developed regional sea-surface temperature (SST) reconstructions
for the tropical oceans and SST-sensitive records for the global
oceans, along with continental-scale temperature reconstructions
and databases (Fig. 1, Methods). We compare the onset of
industrial-era warming in these palaeoclimate datasets to transient multi-model
climate simulations driven by full natural and anthropogenic
forcings. We also use experiments with single and cumulative
external climate forcings to investigate the factors that define the onset
of industrial-era warming.


Regional features of industrial-era warming 

Synthesis of marine palaeoclimate records spanning the past 2,000
years has identified a robust global surface-ocean cooling trend that
reached coolest conditions during the period AD 1400–1800. This
finding is qualitatively consistent with pre-industrial cooling trends in
terrestrial records and can be explained by an increased frequency of
explosive volcanism during the past millennium. Marine records
with moderate-to-high (<25 yr) temporal resolution indicate that,
in many regions, this long-term SST cooling trend reversed during
the industrial era, including in the tropical oceans where robust
regional SST reconstructions spanning the past four centuries have
been developed using coral archives. Industrial-era warming in the
area-weighted average of regional tropical SST reconstructions is visually
similar to warming of the global area-weighted mean of terrestrial
temperature reconstructions (Fig. 1c, d). In particular, the average
Figure 1 | Terrestrial and marine palaeoclimate reconstructions. a, b,
Instrumental temperature trends over the period AD 1961–2010 for surface
air temperature (SAT; a) and sea surface temperature (SST; b). The black
boxes indicate the PAGES 2k continental-scale reconstruction regions
(a) and the PAGES Ocean2k high-resolution (annual) reconstruction
regions (b). The symbols in b indicate the locations of highly resolved
(blue circles; Extended Data Table 1) and moderately resolved (<25 yr,
purple squares; crossed squares indicate ocean upwelling sites; Extended
Data Fig. 7) marine palaeo-SST proxy records. c, d, 25-year moving
averages of the area-weighted mean terrestrial temperature anomaly
(brown line) and the 25%–75% range across continental-scale
reconstructions (shading) (c) and area-weighted mean tropical SST
anomaly (blue line) and minimum–maximum range across the Indian,
western Pacific and western Atlantic reconstructions (shading) (d).
Anomalies are relative to the AD 1961–1990 mean (dashed lines).


terrestrial and tropical ocean temperature histories show industrial-era 
warming developing after AD 1800, with similar multi-decadal expressions
of accelerated and reduced warming phases.

The similarity of average terrestrial and marine temperature histo- 
ries (Fig. 1c, d) masks important regional differences in industrial-era 
warming (Fig. 2, Extended Data Fig. 1). To examine these regional 
features, we first assess when sustained, significant warming began 
in the regional temperature reconstructions. We define a sustained, 
significant trend as the most recent trend that persists until the end of 
the reconstruction and that is significantly different from zero above 
the 90% confidence level (P < 0.1). We determine the median time of
onset of these sustained trends across different levels of smoothing
(15–50 yr) applied to the regional reconstructions (Methods, Extended
Data Fig. 2a). The strengths and limitations of this and other
change-point detection methods in assessing the onset of industrial-era 
warming are explored using synthetic time series (Methods, Extended 
Data Fig. 3). Model-based testing suggests that estimates for warm- 
ing onset are insensitive to the seasonal preference that exists in some 
regional reconstructions (Methods, Extended Data Fig. 4).
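
The onset definition above can be sketched in a few lines. The example below is a deliberate simplification and not the authors' procedure: it replaces the Gaussian filter with a centred moving average, omits the significance test entirely, and treats the onset as the year after which the smoothed series rises without interruption to the end of the record. All function names and the synthetic series are hypothetical.

```python
import statistics


def moving_average(values, width):
    """Centred moving average; windows are truncated at the ends of the series."""
    half = width // 2
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


def onset_of_sustained_warming(years, temps, width):
    """Year after which the smoothed series rises without interruption to the end."""
    smooth = moving_average(temps, width)
    onset_index = None
    # Walk backwards from the end until the smoothed series stops rising.
    for i in range(len(smooth) - 1, 0, -1):
        if smooth[i] - smooth[i - 1] <= 0:
            break
        onset_index = i - 1
    return None if onset_index is None else years[onset_index]


def median_onset(years, temps, widths=range(15, 51, 5)):
    """Median onset year across a range of smoothing widths (15 to 50 years)."""
    onsets = [onset_of_sustained_warming(years, temps, w) for w in widths]
    onsets = [o for o in onsets if o is not None]
    return statistics.median(onsets) if onsets else None


# Synthetic, noise-free series: slow cooling until 1830, then steady warming.
years = list(range(1500, 2001))
temps = [-0.001 * (y - 1500) if y < 1830 else -0.33 + 0.005 * (y - 1830)
         for y in years]
onset = median_onset(years, temps)
```

Even in this idealized case the detected onset does not coincide exactly with the imposed change point, because smoothing spreads the transition over neighbouring years. That sensitivity is one reason to test such detection methods on synthetic series before applying them to reconstructions.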

Sustained, significant warming began in the tropical oceans around 
the 1830s, with no discernible difference in onset across the three 
tropical SST reconstruction regions (Fig. 2a, Table 1, Extended Data 
Fig. 1a, b). The onset of tropical-ocean warming is similar to the 
median onset of warming in Northern Hemisphere mean temperature
(Extended Data Fig. 1a), although the Northern Hemisphere ensemble 
also includes a sub-member where recent warming is not sustained to 
the end of the reconstruction (Supplementary Fig. 1). This exception 
is due to strong multi-decadal variability, which is common to each of 
the Northern Hemisphere mid-latitude continental regions and can 
delay the detection of sustained warming trends at narrow filter widths 
(Extended Data Fig. 5). Nevertheless, each of the Northern Hemisphere 
regional-scale reconstructions also displays mid-nineteenth-century 
onsets for industrial-era warming (Fig. 2a, Table 1, Extended Data 
Fig. 1a, b).

In contrast to the mid-nineteenth-century onset of warming in
the tropical oceans and the Northern Hemisphere, the Southern
Hemisphere onset of industrial-era warming appears delayed (Fig. 2a,
Table 1). In hemispheric-scale reconstructions, the median estimate
for the onset of sustained warming is approximately 50 years later in the
Southern Hemisphere than the Northern Hemisphere (Extended Data
Fig. 1a, Supplementary Fig. 1). The regional structure of this apparent
Southern Hemisphere lag involves sustained, significant warming
developing over Australasia and South America around the start of
the twentieth century, while significant continent-scale warming is not
detected for Antarctica (Table 1, Extended Data Fig. 1a). The Antarctic
reconstruction has the greatest uncertainty of the regional reconstructions
(Methods), and significant warming has been documented over
the Antarctic Peninsula and West Antarctica since the mid-twentieth
century. However, the absence of significant Antarctic warming at
the continent scale is corroborated by post-1979 satellite observations
averaged across Antarctica.

Industrial-era warming across the oceans and continents is further
investigated by examining the rates of regional warming. All century-scale
linear trends in the regional reconstructions since AD 1500 are
calculated (100-yr trends with 1-yr time step; Supplementary Video 1)
and the distributions of trends beginning since AD 1800 are used to
assess the regional rates of industrial-era warming. For each tropical
ocean and Northern Hemisphere regional reconstruction, the distribution
of century-scale trends starting after AD 1800 has a clear positive
shift compared to earlier trends (Fig. 2b, Table 1, Extended Data
Fig. 1c) and includes the largest century-scale warming trend of the
past 500 years. Temperature trends in the Arctic since AD 1800 are
greater than in any other region, indicative of Arctic amplification.
The similarity of post-AD 1800 trends for the tropical Indian and
western Atlantic oceans with those in Europe, Asia and North America
(Table 1) indicates that industrial-era warming of the tropical oceans
has progressed at a rate similar to warming of the Northern Hemisphere
mid-latitude continents. By contrast, rates of century-scale warming
since AD 1800 in the Southern Hemisphere regional reconstructions
are slower than for the tropical oceans and Northern Hemisphere continents
(Fig. 2b, Table 1). This difference may be related to the delayed
onset of warming in the Australasia and South America reconstructions,
but is also consistent with instrumental evidence for hemispheric
asymmetries in the rate of twentieth-century warming (Fig. 1a, b). For
Antarctica, the absence of continent-scale warming during the industrial
era results in pre- and post-AD 1800 trend distributions that are
statistically indistinguishable (Extended Data Fig. 1c).
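
The century-scale trend calculation (100-yr windows advanced in 1-yr steps, then split at AD 1800) can be sketched as follows. This is an illustrative outline using ordinary least squares on a synthetic series; the function names are hypothetical and the authors' exact fitting procedure may differ.

```python
def linear_trend(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def century_trends(years, temps, window=100):
    """Map each start year to the trend (deg C per century) over the next `window` years."""
    trends = {}
    for i in range(len(years) - window + 1):
        slope = linear_trend(years[i:i + window], temps[i:i + window])
        trends[years[i]] = slope * 100.0
    return trends


def split_by_start(trends, boundary=1800):
    """Separate trends that begin before `boundary` from those beginning at or after it."""
    early = [t for start, t in trends.items() if start < boundary]
    late = [t for start, t in trends.items() if start >= boundary]
    return early, late


# Synthetic series: flat until 1830, then warming at 0.005 deg C per year.
years = list(range(1500, 2001))
temps = [0.0 if y < 1830 else 0.005 * (y - 1830) for y in years]
trends = century_trends(years, temps)
early, late = split_by_start(trends)
```

Comparing the `early` and `late` lists, for example as histograms, mirrors the pre- and post-AD 1800 trend distributions described in the text.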

The time when a climate-change signal exceeds the range of climate
variability is known as the ‘time of emergence’. The time of emergence

© 2016 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 


[Figure 2 panels: a, Onset of warming; b, 100-yr trend distributions; c, Climate emergence. Regions shown: Arctic, Europe, Asia, North America, Western Atlantic Ocean, Western Pacific Ocean, Indian Ocean, Australasia, South America, Antarctica. Horizontal axis in a and c: Year (1500–2000).]

Figure 2 | Onset and magnitude of industrial-era warming in regional
temperature reconstructions. a, Regional reconstructions since
AD 1500 (coloured lines) with 15-yr (thin black lines) and 50-yr (thick
black lines) Gaussian smoothing, shown alongside the median time of
onset for sustained, significant industrial-era warming assessed across
15–50-yr filter widths (vertical black bars) (Methods, Extended Data
Fig. 2a). Grey 1 °C scale bar denotes the y-axis scale of each regional
temperature reconstruction. b, Histograms of century-scale regional
trends, comparing those beginning during the period AD 1500–1799
(grey bars) with those beginning since AD 1800 (coloured bars). The heights


for industrial-era warming depends on (i) when warming began, (ii) 
the rate of warming and (iii) the magnitude of interannual to multi-decadal
climate variability. Time-of-emergence studies typically use twentieth-century
instrumental data or post-1850 (historical) simulations to
characterize the baseline climate, and have commonly concluded that
unprecedented climates will emerge first in tropical air temperatures
because of the small magnitude of interannual temperature variability in
these regions. However, our findings of a mid-nineteenth-century
onset of industrial-era warming suggest that, in some regions, the entire 
instrumental period contains a signature of climate warming, rendering 
it unsuitable for determining climate emergence. We use the multi-century
context available from the regional palaeoclimate reconstructions,
and a pre-AD 1800 reference period (Methods), to assess the
extent to which industrial-era warming may have already emerged in 
regional climates. 
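
The emergence criterion can be sketched directly from its definition in the Figure 2 caption: the smoothed signal must exceed, and remain above, the +2σ level of interannual variability in the AD 1622–1799 reference period. The example below is a simplified illustration with a single moving-average width and a synthetic series, whereas the paper takes the median across 15–50-yr filters. The function names are hypothetical.

```python
import statistics


def moving_average(values, width):
    """Centred moving average; windows are truncated at the ends of the series."""
    half = width // 2
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


def time_of_emergence(years, temps, width=15, ref_start=1622, ref_end=1799):
    """First year after which the smoothed series stays above mean + 2 sd of the reference period."""
    ref = [t for y, t in zip(years, temps) if ref_start <= y <= ref_end]
    threshold = statistics.mean(ref) + 2.0 * statistics.stdev(ref)
    smooth = moving_average(temps, width)
    emergence = None
    # Walk backwards from the end until the signal drops to or below the threshold.
    for y, s in zip(reversed(years), reversed(smooth)):
        if s <= threshold:
            break
        emergence = y
    return emergence


# Synthetic series: alternating +/-0.1 deg C "variability" plus warming from 1830.
years = list(range(1500, 2001))
temps = [(0.1 if y % 2 == 0 else -0.1) + (0.005 * (y - 1830) if y >= 1830 else 0.0)
         for y in years]
toe = time_of_emergence(years, temps)
```

In this synthetic case the signal emerges a few decades after warming begins. A region with larger interannual variability, or slower warming, would emerge later, which is the trade-off described in the text.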

Industrial-era warming led to the emergence of regional climate 
change first in the Arctic (Fig. 2c, Table 1). Despite the large variability 
of Arctic climate, the palaeoclimate time-of-emergence assessment 
indicates that the early onset and rapid rate of warming resulted in 
the emergence of climate change during the 1930s (approximately 
100 years after sustained, significant warming began). The tropical 
ocean regions display a similar rate of warming to that of the Northern 
Hemisphere mid-latitude continents, but the industrial-era warming 
signal emerges sooner in the tropical oceans (time of emergence at 
around AD 1948–1962) because of the smaller magnitude of variability
there. Emergence of industrial-era warming for Australasia is
around AD 1960, because the delayed onset of warming is compensated
by the small magnitude of interannual variability in this regional
reconstruction. All other regions apart from Antarctica are nearing 
of the bars are normalized by the maximum occurrence across each region
to aid visual representation. See Supplementary Video 1 for temporal
evolution of century-scale trends. c, Regional reconstructions (coloured
lines) with 15-yr (thin black lines) and 50-yr (thick black lines) filters,
shown alongside the ±2σ range (shaded boxes) of interannual variability
over the AD 1622–1799 reference period (+2σ level continued by dashed
lines). The median time of climate emergence (vertical black bars) is
defined as the time at which the climate signal (15–50-yr width filters of
the regional reconstructions) first exceeds and remains above the +2σ
threshold of the reference period (Methods).


the emergence of warming above the threshold of pre-industrial 
climate variability by the start of the twenty-first century (the end of 
the reconstructions). 
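
The emergence definition used here (the smoothed signal first exceeds, and then remains above, the upper 2σ level of interannual variability in the AD 1622-1799 reference period) can be illustrated with a minimal sketch. It assumes a simple centred running mean as the filter, which is a stand-in and may differ from the filter used in the study; the function names and the handling of series that never emerge are illustrative choices, not the authors' code.

```python
import numpy as np


def running_mean(x, width):
    """Centred moving average; positions without a full window are NaN."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)
    if width > x.size:
        return out
    valid = np.convolve(x, np.ones(width) / width, mode="valid")
    start = (width - 1) // 2
    out[start:start + valid.size] = valid
    return out


def time_of_emergence(years, temps, width, ref=(1622, 1799), n_sigma=2.0):
    """First year from which the smoothed signal stays above the threshold.

    Returns None if the signal never emerges permanently.
    """
    years = np.asarray(years)
    temps = np.asarray(temps, dtype=float)
    in_ref = (years >= ref[0]) & (years <= ref[1])
    # Threshold: reference-period mean plus n_sigma standard deviations of
    # the unsmoothed (interannual) reference-period values.
    threshold = temps[in_ref].mean() + n_sigma * temps[in_ref].std(ddof=1)
    smooth = running_mean(temps, width)
    finite = np.isfinite(smooth)
    if not finite.any():
        return None
    # Treat undefined (edge) values as "above" so they are never counted as
    # the last year below the threshold.
    filled = np.where(finite, smooth, np.inf)
    below = np.flatnonzero(filled <= threshold)
    first = below[-1] + 1 if below.size else np.flatnonzero(finite)[0]
    if first >= years.size or not finite[first]:
        return None
    return int(years[first])


def median_time_of_emergence(years, temps, widths=range(15, 51)):
    """Median emergence year over filter widths; widths with no emergence
    are skipped (an assumption of this sketch)."""
    toes = [time_of_emergence(years, temps, w) for w in widths]
    toes = [t for t in toes if t is not None]
    return float(np.median(toes)) if toes else None
```

Requiring the signal to remain above the threshold, rather than merely to cross it once, is what makes late emergence (within a couple of decades of the end of a record) hard to confirm.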

Our regional palaeoclimate assessments suggest that widespread 
climate warming observed during the twentieth century forms part 
of a sustained trend that began in the tropical oceans and over some 
Northern Hemisphere land areas around the 1830s (about 180 years 
ago). Although caveats exist relating to the accuracy with which 
regional palaeoclimate reconstructions represent past temperature 
changes*!*"'®, our multi-century assessments clearly demonstrate the 
need to incorporate pre-twentieth-century information in comprehen- 
sive assessments of industrial-era warming. The early onset of industrial- 
era warming may not alter the conclusions of time-of-emergence studies 
focused on protecting infrastructure built during recent decades*!; 
however, our findings imply that time-of-emergence studies that rely on 
a twentieth-century baseline may underestimate how soon the effects of 
climate change will fall outside the range of climate variability to which 
natural systems are adapted*®. 


Climate forcing of industrial-era warming 

Model simulations provide an important tool for investigating 
which forcings are most consistent with the reconstructed onset of 
industrial-era warming. We examine the regional responses of global 
climate model simulations to natural and anthropogenic forcings since 
AD 1500, applying the same trend-detection methodology used for the 
palaeoclimate reconstructions (Methods). An ensemble of ten different 
models reproduces the near-synchronous mid-nineteenth-century 
onset of sustained, significant warming observed for reconstructed 
Northern Hemisphere surface air temperature and tropical SST 

Figure 3 | Data-model comparison of the onset of industrial-era
warming. a-d, Median onset of sustained, significant warming in regional
reconstructions (vertical grey bars; as in Fig. 2a, vertical black bars) 
compared to various ensemble results. a, Median (vertical blue bars), 
25%-75% range (boxes) and 5%-95% range (horizontal lines) of the 
corresponding regional median warming onsets across ten multi-model last- 
millennium climate simulations with full radiative forcings (Supplementary 
Fig. 1). b, Median timing (open symbols) of warming onset across three- 
member ensembles for single radiative forcing experiments (green circles, 
orange diamonds and red triangles) and a ten-member ensemble with full 
radiative forcings (blue squares) using the LOVECLIM model'®. c, Median 


(Fig. 3a, Table 1). Palaeoclimate data-model agreement is particularly
good for Northern Hemisphere terrestrial regions, where the patterns 
of short-term cooling caused by volcanic eruptions and sustained 
recent warming from greenhouse gas emissions are remarkably 
similar (Extended Data Fig. 2). The agreement between the multi- 
model ensemble and palaeoclimate reconstructions suggests 
that the onset of industrial-era warming over Northern Hemisphere 
landmasses and in the tropical oceans is consistent with a forced 
climate response. 

By contrast, none of the climate models show evidence for a delayed 
Southern Hemisphere onset of industrial-era warming (Fig. 3a). A 
ten-member ensemble of LOVECLIM simulations also suggests that 
the delayed development of warming over Antarctica and Australasia 
is not explicable within the range of unforced climate variability in 
that model (Fig. 3b). Instead, the evolution of regional temperature 
trends in the multi-model ensemble mean (Extended Data Fig. 2b) 
and within individual models (Supplementary Fig. 2) shows a glob- 
ally synchronous thermodynamic response of surface temperatures to 
external climate forcings. Previous palaeoclimate studies have noted 
that climate models tend to overemphasize Northern Hemisphere- 
Southern Hemisphere synchronicity of past temperature changes**”, 
perhaps by overestimating externally forced climate responses in 
the Southern Hemisphere and/or underestimating the magnitude 
of Southern Hemisphere climate variability**. Unresolved or inac- 
curate physical processes in model representations of the Southern 

timing (open symbols) of warming onset for three-member ensembles of
CSIRO Mk3L simulations using cumulative radiative forcing experiments”. 
d, Median (green vertical bars), 25%-75% range (boxes) and 5%-95% 
range (horizontal lines) of regional median warming onsets for 13 ensemble 
members across four different climate models forced with only greenhouse 
gas changes'*!. Lower panel in a shows the corresponding full radiative 
forcing (from orbital, greenhouse gas, solar and volcanic sources)’”. 

Lower panel in d shows the corresponding equivalent atmospheric CO₂
concentration relative to the mean (dashed line) and ±1.5 interquartile
range (grey shading; outlier test) of an AD 0-1500 baseline interval.

See Extended Data Table 2 for model details. 


Hemisphere ocean-atmosphere-cryosphere systems also hinder the
accurate simulation of Southern Hemisphere climate*®*? including 
the overestimation of simulated Antarctic region warming com- 
pared with satellite observations of recent surface air and ocean 
temperature trends”®. The paucity of climate observations also 
hinders attempts to resolve differences between observations and 
simulations in the Southern Hemisphere”? ? As a result, currently
available model output cannot reasonably be used to assess the 
delayed onset of Southern Hemisphere industrial-era warming that is 
suggested by palaeoclimate observations. 

Naturally forced climate cooling may have helped to set the stage for 
the widespread onset of industrial-era warming in the tropical oceans 
and over Northern Hemisphere landmasses during the mid-nineteenth 
century. Episodic cooling caused by the large 1815 Tambora volcanic 
eruption is prominent in Northern Hemisphere terrestrial temperature 
reconstructions (Fig. 2a). In last-millennium model simulations, the 
strong cooling caused by the Tambora eruption is followed immediately 
by a decade-scale interval of accelerated global warming as the cli- 
mate recovers™*. The Dalton solar minimum also occurred in the early 
nineteenth century, but solar forcing is thought to have only a small 
influence on last-millennium climate compared to the effects of volcanic
eruptions”’. In the CSIRO Mk3L experiments, in which forcings
were applied cumulatively rather than individually, it is particularly the
addition of volcanic forcing that tends to focus the onset of industrial-era
warming into a narrower time window (both between regions and
between ensemble members for the same region) compared to experiments
run with greenhouse forcing alone (Fig. 3c).

Table 1 | Onset, rate and emergence of industrial-era warming in reconstructions and simulations

Onset: year of onset of sustained, significant warming trends. Trend: century-scale trend distribution (°C per century) in the reconstructions. Emergence: year of emergence.

Region                  Onset (recon.)  Onset (simulations)  Trend, AD 1500-1799     Trend, since AD 1800    Emergence (recon.)
Arctic                  1831            1843 (1819-1880)     -0.11 (-0.41 to 0.07)   1.07 (0.92 to 1.26)     1930
Europe                  1852            1888 (1840-1987)     0.02 (-0.39 to 0.33)    0.46 (0.29 to 0.58)     1994‡
Asia                    1849            1833 (1807-1999)     0.00 (-0.18 to 0.10)    0.48 (0.35 to 0.54)     1987‡
North America           1847            1859 (1823-1900)     0.01 (-0.38 to 0.15)    0.48 (0.29 to 0.52)     §
Western Atlantic Ocean  1828            1836 (1811-1879)     -0.11 (-0.21 to 0.01)   0.41 (0.27 to 0.51)     1948
Western Pacific Ocean   1834            1830 (1818-1836)     -0.09 (-0.13 to 0.02)   0.27 (0.19 to 0.35)     1962
Indian Ocean            1827            1830 (1814-1838)     -0.14 (-0.31 to -0.04)  0.51 (0.39 to 0.54)     1962
Australasia             1904*           1832 (1808-1833)     0.00 (-0.04 to 0.05)    0.07 (0.02 to 0.23)     1959
South America           1896            1840 (1802-1880)     -0.02 (-0.11 to 0.21)   0.20 (-0.02 to 0.38)    §
Antarctica              †               1839 (1819-1851)     -0.05 (-0.15 to 0.09)   -0.06 (-0.14 to 0.07)   §

Statistics represent changes in surface air temperature (SAT) for continental reconstruction regions and sea surface temperature (SST) for tropical ocean regions. The values given are medians
(inter-model medians for the simulations), with the 25%-75% ranges in parentheses. The reconstructed median years of onset and emergence are determined using 15-50-yr filter widths. Visual
representations of the data are found in Fig. 2a (onset of warming in reconstructions), Fig. 3a (onset of warming in simulations), Fig. 2b (rates of warming) and Fig. 2c (climate emergence).

*Compared with a median warming onset of AD 1886 in the original Australasia 2k reconstruction? that includes marine SST-sensitive records.

†Sustained, significant warming never achieved in reconstruction.

‡Emergence of industrial-era warming above the AD 1622-1799 (reference interval) variability was within 20 years of the end of the reconstruction, making permanent emergence uncertain.

§Emergence was not achieved in reconstruction.

Simulations suggest that recovery from volcanic cooling is not an 
essential requirement for reproducing the mid-nineteenth-century 
onset of industrial-era warming. Multi-model experiments forced 
with only greenhouse gases capture regional onsets for sustained 
industrial-era warming that are consistent with the tropical ocean and 
Northern Hemisphere continental reconstructions (Fig. 3d). Testing of 
our change-point detection method also indicates that volcanic-style 
cooling events do not substantially alter the onset determined for sus- 
tained warming trends in synthetic time series (Methods, Extended 
Data Fig. 3a, b). We conclude that greenhouse forcing of industrial-era 
warming began by the mid-nineteenth century, but that the confluence 
of explosive volcanic eruptions around this time was probably also 
influential in aligning the onset of warming over tropical ocean and 
Northern Hemisphere land regions. 


Mechanisms of industrial-era warming 
Our regional palaeoclimate assessments show that the thermodynamic 
response to increasing greenhouse gas concentrations developed in 
the oceans and atmosphere even when anthropogenic contributions 
were small. The spatial fingerprint of the onset of industrial-era warm- 
ing may further elucidate the role of the oceans in the development of 
anthropogenic warming. We explore this by assessing change-points 
in the site-level palaeoclimate records that contribute to the marine!*> 
and terrestrial*!® temperature reconstructions (Extended Data 
Fig. 6). Onset estimates from site-level records are more variable than 
for the regional reconstructions. This variation arises from (i) lower 
trend-to-variability (or signal-to-noise) ratios in individual palaeocli- 
mate records, (ii) varying lengths of the individual records that do not 
always include information before the onset of sustained industrial-era 
trends and (iii) differences in how representative each record is of local 
temperature. Issues similar to (i) and (ii) limit the detection and attribu- 
tion of climate change at subregional scales from climate observations 
and simulations*. For these reasons, we view the site-level analyses 
in only a qualitative sense (Fig. 4), using them to aid the more robust 
assessments derived from regional reconstructions (Fig. 2). 

Sustained, significant warming trends developed during the industrial
era across the majority (71%, n = 55) of marine records (Extended
Data Fig. 6a). This complements similar findings of widespread recent 
warming trends in site-level palaeoclimate records from predominantly 
terrestrial environments*. Development of sustained warming during 
the nineteenth century in individual records from the tropical oceans 
and in Northern Hemisphere mid- and high-latitude terrestrial records 


(Fig. 4) corroborates our findings based on the regional reconstructions.
Some Southern Hemisphere mid-latitude terrestrial records also show 
this early warming (Fig. 4a), suggesting that although continental-scale 
temperature reconstructions for Australasia and South America indicate 
a delayed onset of industrial-era warming (Fig. 2), this may not be rep- 
resentative of all Southern Hemisphere mid-latitude land areas (Fig. 4a). 

The early onset of industrial-era warming of the tropical oceans 
was widespread; however, industrial-era climate changes may 
have resulted in localized surface-ocean cooling in some settings 
(Extended Data Figs 6b, 7a). Previous assessments of the moderately 
resolved marine records cautiously concluded that qualitative warm- 
ing and cooling trends during the twentieth century were produced 
if the records were composited into a priori defined non-upwelling 
and upwelling subsets, or if the records were composited by tropi- 
cal versus Northern Hemisphere extratropical location, or by proxy 
type’®. Our change-point assessment of the moderately resolved 
marine records finds that the most distinct differentiation between 
recent cooling and warming trends occurs when the records are 
separated into upwelling and non-upwelling sites (Extended Data 
Fig. 7b-d). Therefore, enhanced ocean upwelling may be the most 
plausible mechanism for explaining the recent cooling trends detected 
at some marine sites, consistent with theories that climate warming 
could, in some locations, cause strengthening of the surface winds 
that generate coastal upwelling****. 

The early development and rapid rate of tropical ocean warming 
during the industrial era (Figs 2, 4b) may corroborate model-based 
descriptions of an enhanced equatorial response of the oceans to 
increased greenhouse forcing**“°. The warming near the equator as a 
result of an enhanced equatorial response is caused by increased sur- 
face ocean stability and a reduction in surface evaporative cooling due 
to the combination of lower wind speed and relative humidity. The 
hypothesized enhanced-equatorial-response mechanism and spatial
fingerprint differ from an El Niño-like response of the tropical oceans
to global warming (requiring weaker Walker circulation, reduced eastern
Pacific upwelling and reduced east-west SST gradient)*’ and show
greater consistency between models*’. A regional SST reconstruction
for the eastern Pacific, although not used in our study because it is 
thought to have a spuriously large twentieth-century trend attributed 
to hydrologic effects'*, indicates that sustained, significant warming 
of eastern Pacific SST began markedly later (around AD 1913) than in
the other tropical ocean regions". Full-forcing and greenhouse-only 
climate model simulations assessed spatially at a grid resolution of 
5° × 5° (Extended Data Fig. 8) also display a delayed onset of industrial-era
warming in the eastern Pacific over a narrow region along the

Figure 4 | Latitudinal development of site-level temperature trends.

a, b, Cumulative distributions for terrestrial (a) and marine (b) records, 
showing latitudinal development of sustained significant warming (red; 
upward) or cooling (blue; downward) trends across the site-level records 
(Extended Data Fig. 6). The distributions are expressed as a proportion 
of total records within each latitudinal band (number of total records, n, 
indicated; grey values indicate an insufficient number of site-level records 
(n < 2) for meaningful comparison). Light shading in b denotes trends at
marine sediment core sites with an a priori upwelling regime (Extended 


equator, distinct from the otherwise widespread early onset of tropical 
surface ocean warming (Extended Data Fig. 8, Supplementary Fig. 3). 
The delayed detection of sustained, significant warming in the eastern 
Pacific may be influenced by large-amplitude interannual variability 
there (Extended Data Fig. 3b, c), which limits our ability to confidently 
evaluate El Niño (or La Niña)-like dynamic mechanisms in defining
the onset of industrial-era warming across the oceans. Nevertheless, the 
early onset of industrial-era warming in tropical ocean regions away 
from the equatorial eastern Pacific, in reconstructions and simulations, 
appears to support an enhanced equatorial response in the SST changes 
caused by global warming. 

Widespread warming of tropical SSTs during the industrial era 
(Figs 2, 4b) may have had global importance because of the non- 
linear influence of tropical SSTs on deep atmospheric circulation, 
which redistributes heat and moisture. The latitudinal development 
of terrestrial warming that we observe across individual palaeo- 
climate records (Fig. 4a) is similar to that reported for twentieth- 
century terrestrial air temperature observations”. In that study it was 
proposed that twentieth-century terrestrial warming focused over the 
Northern Hemisphere subtropical-subpolar regions and in a narrow 
band over the Southern Hemisphere subtropics could be indicative of 
anthropogenically forced widening of the tropics through expansion 
of the Hadley circulation cells*”. The causes of recent multi-decadal 
episodes of contraction and expansion of Hadley circulation*?* have 
proven difficult to resolve because of the brevity of observational 
datasets. However, numerous studies suggest that tropical ocean 
warming is essential for reproducing the recent poleward expansion 
of the Hadley circulation, because of the effect of SST on tropospheric 
temperature, tropopause height, and baroclinic wave position and 
stability“. Further assessments of marine and terrestrial palaeo- 
climate networks, including compilations of hydroclimate-sensitive 
records currently in development, have the potential to provide a context 

Data Fig. 7), for which the localized SST response*** to regional climate
warming may not be representative of latitudinal average climate. Dashed 
lines show the temporal coverage of site-level records (expressed as a 
proportion of latitudinal band total, n). c, The number and type of proxy 
records available by latitudinal band (bars are coloured by archive type). 
Because of the varying length of individual records, not all of which cover 
the full interval of industrial-era warming (particularly tropical corals), we 
use these data only as a qualitative indicator of the relative timing of the 
onset of industrial-era warming between different latitudinal bands. 


for the role of rapid tropical ocean warming during the industrial era 
in widening the tropical climate belt. 

Differences between Southern Hemisphere reconstructions and sim- 
ulations, and the lack of suitable palaeoclimate records from the extrat- 
ropical Southern Hemisphere oceans, currently preclude an assessment 
of the role of the oceans in the delayed onset of Antarctic warming that 
is seen in regional (Fig. 2) and site-level (Fig. 4a) palaeoclimate analyses,
and in observational records”°. Idealized model experiments of
ocean heating predict a centennial-scale delayed onset of anthropogenic 
warming in the Southern Ocean caused by upwelling of unmodified 
subsurface water and northward advection of any surface warming 
signal’!*”. The strengthening of westerly winds over the Southern 
Ocean during the twentieth century, related to the Southern Annular 
Mode*’, has probably also influenced the delayed development of
sustained industrial-era warming over Antarctica. In the Northern 
Hemisphere, the scarcity of suitable marine palaeoclimate records from 
the mid- to high latitudes also precludes an assessment of ocean-land 
relationships during the onset of industrial-era warming. However, we 
find that in full-forcing and greenhouse-only simulations the onset 
of sustained surface ocean warming in the North Atlantic Ocean 
is delayed, or instead is characterized by cooling (Extended Data 
Fig. 8). This finding is consistent with reports of an unusual slowdown 
of Atlantic Meridional Overturning Circulation during the twentieth 
century**. Increasing knowledge of the temperature evolution of the 
extratropical oceans before and during the industrial era should be 
considered a key target for future palaeoclimate research. 

The spatial development of industrial-era warming across the oceans 
and continents demonstrates that the tropical oceans and Northern 
Hemisphere were particularly responsive to the climate forcings that 
shaped industrial-era warming. The mid-nineteenth-century com- 
mencement of industrial-era warming suggests that Earth’s surface 
temperature may respond to even small increases in greenhouse gas 
forcing more rapidly than previously thought*”, and highlights the
importance of multi-century palaeoclimate records and model simu- 
lations in assessing the response of worldwide climate to anthropogenic 
greenhouse gas emissions. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 

METHODS


Palaeoclimate data. We use the newly developed SST reconstructions and pal- 
aeoclimate databases of the PAGES Ocean2k working group. For full details of 
the data selection criteria and in-depth discussions of the marine datasets and 
reconstructions, see refs 14 and 15. 

The high-resolution (annual or better) component of the Ocean2k dataset con- 
sists of 57 coral records (including multiple sensors for some records; Extended 
Data Table 1), which were used to reconstruct regional mean SST histories for 
different sectors of the tropical oceans!“. The SST reconstructions for the tropical 
Indian, western Pacific and western Atlantic regions are statistically robust over 
most of the past 400 years'*. The same reconstruction method applied to the trop- 
ical eastern Pacific region yielded poorer statistics and a twentieth-century SST 
trend that is stronger than suggested by instrumental records. This was attributed 
to nonlinear hydrologic effects in the eastern tropical Pacific!*; hence, we exclude 
this regional reconstruction from the current study. We use the ‘best’ reconstruc- 
tion for each region, as identified in ref. 14 on the basis of validation statistics 
from the ensemble of reconstructions produced. From the Ocean2k low-resolution 
database!" we use a subset of 21 marine records that are suitable for the recent 
temperature trend analysis carried out in this study (Extended Data Fig. 7a). These 
21 records meet additional criteria of having strong chronological control through 
210Pb profiling or counting of annual sediment layers or coral growth bands, as 
well as an average sample resolution of 25 years or better; we refer to these as 
moderate-resolution records. The sense of the proxy-temperature response of the 
high- and moderate-resolution ocean palaeoclimate records is based on known 
physical relationships for the incorporation of geochemical and biological tracers 
into these records or on individually assessed and published temperature-growth 
relationships. 

We compare the ocean records to the continental-scale temperature recon- 
structions and palaeoclimate databases developed in phase 1 of the PAGES 2k 
project’, including the updated Arctic v1.1.1 reconstruction and database’®. For 
the North America region we use the tree-ring-based reconstruction, which has 
decadal resolution, rather than the lower-resolution pollen-derived temperature 
reconstruction. To avoid record duplication, all marine records were removed 
from the PAGES 2k continental database before site-level trend analysis. We also 
exclude the four instrumental records in the South American 2k database from 
our site-level data assessment so that all information is derived solely from pal- 
aeoclimate archives. At the level of regional temperature reconstructions, a small 
degree of overlap exists in the contributing data used for some of the previously 
published reconstructions. Specifically, 9 of the 28 records used for the Australasia 
2k temperature reconstruction are also used for the tropical western Pacific SST 
reconstruction, and 1 is used for the tropical Indian Ocean SST reconstruction. 
To avoid any bias introduced by data overlap, we use a terrestrial-only Australasia 
2k reconstruction that was produced using the same methodology as the original 
reconstruction’, but excludes any of the marine geochemistry records that are 
used in the Ocean2k high-resolution reconstructions. Details of this terrestrial- 
only reconstruction, and the reconstruction data and statistics, accompany this 
paper as Supplementary Data 1. The terrestrial-only reconstruction demonstrates 
close agreement with the original Australasia 2k reconstruction and none of the 
interpretations presented here is altered by using the original reconstruction 
instead of the terrestrial-only version. 

Matlab data structures containing the site-level proxy data and regional
reconstructions used in this study are archived with the National Centers for
Environmental Information (NCEI) World Data Service for Paleoclimatology at
http://www.ncdc.noaa.gov/paleo/study/20083.

Palaeoclimate data analysis and uncertainties. The analyses performed in this 
study use annually resolved, unsmoothed input data. For the moderately resolved 
marine records (resolution given in Extended Data Fig. 7a) and the North America 
reconstruction (decadal resolution), pseudo-annual data were produced by
performing a nearest-neighbour interpolation to produce stepped datasets that
continued values across the entire sampling interval that they represent. Chronological
uncertainty in the palaeoclimate records and reconstructions is extremely low for 
the industrial era. Annual layer chronologies would be expected to be known to 
within ±2-3 yr back to AD 1800, using conservative estimates and not taking into
account the 1815 Tambora eruption that left an unambiguous fixed-time marker 
in many palaeoclimate archives. Therefore, chronological uncertainty in the indus- 
trial era is well within the level of interpretability of our estimates for the onset of 
sustained, significant warming based on the change-point detection method (see 
Methods section ‘Change-point method testing’). 
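
The pseudo-annual stepping described above can be sketched as follows. This is an illustration only: it assumes each sample is dated at the mid-point of the interval it represents and that ties between two equally distant samples go to the earlier one; neither detail is stated in the text, and the function name is hypothetical.

```python
import numpy as np


def to_pseudo_annual(sample_years, values, start=None, end=None):
    """Nearest-neighbour interpolation of a coarsely sampled record onto
    calendar years, giving a stepped series with no smoothing."""
    sample_years = np.asarray(sample_years, dtype=float)
    values = np.asarray(values, dtype=float)
    order = np.argsort(sample_years)
    sample_years, values = sample_years[order], values[order]
    if start is None:
        start = int(np.ceil(sample_years[0]))
    if end is None:
        end = int(np.floor(sample_years[-1]))
    years = np.arange(start, end + 1)
    # For every calendar year, find the index of the closest sample date.
    # argmin returns the first minimum, so ties go to the earlier sample.
    nearest = np.abs(years[:, None] - sample_years[None, :]).argmin(axis=1)
    return years, values[nearest]
```

For a decadal record sampled at 1800, 1810 and 1820, each value is held constant for roughly five years either side of its date, so the stepped series carries no information beyond the original samples.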

The area weightings used to calculate the average tropical ocean temperature 
histories in Fig. 1d were based on the surface area of the target reconstruction 
regions on an Earth ellipsoid. These areas are: Indian Ocean, 25.5 × 10⁶ km²;
western Pacific, 26.9 × 10⁶ km²; western Atlantic, 5.1 × 10⁶ km². The areas of the
terrestrial reconstruction regions* (Fig. 1c) are: Arctic, 34.4 × 10⁶ km²; Europe,
13.0 × 10⁶ km²; Asia, 31.1 × 10⁶ km²; North America, 12.5 × 10⁶ km²; Australasia,
37.9 × 10⁶ km²; South America, 20.0 × 10⁶ km²; Antarctica, 34.4 × 10⁶ km².
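
As an illustration of the area weighting, the sketch below averages regional series using the three tropical ocean areas quoted above (in units of 10⁶ km²). The function and dictionary names are hypothetical, and the regional series are assumed to share a common time axis.

```python
import numpy as np

# Region areas quoted in the text, in units of 10^6 km^2.
TROPICAL_OCEAN_AREAS = {
    "Indian Ocean": 25.5,
    "western Pacific": 26.9,
    "western Atlantic": 5.1,
}


def area_weighted_mean(series_by_region, areas=TROPICAL_OCEAN_AREAS):
    """Area-weighted average of regional time series (same length each)."""
    names = sorted(series_by_region)
    data = np.vstack(
        [np.asarray(series_by_region[name], dtype=float) for name in names]
    )
    weights = np.array([areas[name] for name in names], dtype=float)
    # np.average normalizes by the sum of the weights.
    return np.average(data, axis=0, weights=weights)
```

With these weights the western Atlantic contributes under 10% of the tropical ocean average, so the result is dominated by the Indian and western Pacific reconstructions.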

The seasonality of the temperature signal captured by the reconstructions dif- 
fers between regions, owing to the availability of site-level palaeoclimate records 
in each region, some of which capture climate information related to only a spe- 
cific season (for example, a summer growing season for trees). Detailed discus- 
sion on seasonality can be found in refs 4, 14 and 15. To summarize: the tropical 
ocean reconstructions represent April-March (tropical year) annual averages; the 
Arctic, North America and Antarctic reconstructions represent annual averages; 
the Australasia reconstruction represents a September-February half-year (warm 
season) average; and the Europe, Asia and South America reconstructions repre- 
sent local summer averages. We do not expect the seasonal differences between 
the regional reconstructions to affect our interpretations of the onset of sustained, 
significant warming between different regions. Change-point analysis of climate 
model simulations produces near-identical regional warming onsets when data 
are compiled as annual averages across all regions, and as season-specific averages 
that match the reported seasonality for each corresponding palaeoclimate recon- 
struction (Extended Data Fig. 4). 

The effect that uncertainty in the regional temperature reconstructions may 
have on estimates of the onset of industrial-era warming was assessed using 
reconstruction ensembles (Extended Data Fig. 1a). Reconstruction ensembles are 
available for South America, Australasia and the three tropical ocean regions, and 
were calculated as part of the original reconstruction process by using different 
methodological choices based on the proxy network, the temperature calibra- 
tion interval and/or target dataset. We also extend this analysis to reconstruction 
ensembles available for Northern Hemisphere’ and Southern Hemisphere® average 
temperature as an additional test of the apparent Southern Hemisphere delay in the 
onset of industrial-era warming (Extended Data Fig. 1a, Supplementary Fig. 1). 
Uncertainty in the onset of industrial-era warming related to reconstruction 
uncertainty has a 5%-95% range of within -3 yr to +25 yr for the three tropical
ocean regions (which each have an average ±2σ range across reconstruction members
of approximately 0.1°C during the nineteenth and twentieth centuries). The
Australasia ensembles have an average ±2σ range of 0.4°C during the nineteenth
and twentieth centuries, and there is a -35 yr to +46 yr range (5%-95%) in onset
estimates determined across this ensemble. The South America ensembles have
an average ±2σ range of 0.6°C during the nineteenth and twentieth centuries,
and a -16 yr to +31 yr range (5%-95%) in the onset of industrial-era warming
(Extended Data Fig. 1a). 
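
One way to express onset uncertainty of this kind is as percentiles of the onset offsets across ensemble members. The sketch below assumes the offsets are taken relative to a single reference onset (for example, that of the preferred reconstruction); the text does not say which reference was used, so this choice and the function name are assumptions.

```python
import numpy as np


def onset_uncertainty(member_onsets, reference_onset, lower=5, upper=95):
    """Percentile range (in years) of ensemble onsets relative to a reference.

    Members with no detected onset can be passed as NaN and are ignored.
    """
    offsets = np.asarray(member_onsets, dtype=float) - reference_onset
    offsets = offsets[np.isfinite(offsets)]
    low, high = np.percentile(offsets, [lower, upper])
    return float(low), float(high)
```

A result of (-3.0, 25.0) would correspond to the "-3 yr to +25 yr" form used in the text.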

As a first-order estimate, the ranges of onset timings obtained from regions 
for which reconstruction ensembles are available may provide guidance on how 
reconstruction uncertainty affects other regions. The regional reconstructions 
for the Arctic and Europe have a similar magnitude of uncertainty during the 
nineteenth and twentieth centuries to the South America reconstruction (average 
±2σ of approximately 0.6°C). Therefore, uncertainty in the onset of industrial-era
warming related to regional reconstruction quality may be similar between these
regions. Reconstruction uncertainty is higher for Asia (approximately 0.9°C for
±2 root-mean-square error) and Antarctica (approximately 1.2°C for ±2 standard
error), and so we may expect a larger uncertainty range in onset estimates related 
to reconstruction quality for these regions. Reconstruction uncertainty for North 
America is based on decadal-resolution data, and so is not directly comparable to 
the annual resolution of the other regional reconstructions. 

Sample size. No statistical methods were used to predetermine sample size. 

Model output. We compare the regional palaeoclimate reconstructions to a 
multi-model ensemble of transient last-millennium simulations from AD 850 to
AD 1850°! (Extended Data Table 2), completed as part of the Fifth Coupled Model
Intercomparison Project (CMIP5). Historical simulations from the same ensemble
were used to extend the model output from AD 1850 to AD 2005. The CMIP5
last-millennium and historical experiments use transient radiative forcings that 
include orbital, solar, volcanic, greenhouse and ozone parameters, as well as land- 
use changes!””?. All data were accessed from the Earth System Grid Federation, 
with the exception of the historical portions of the HadCM3 simulation (provided 
by A. Schurer) and the FGOALS-s2 simulation (provided by T. Zhou and W. Man). 

Multiple simulations of the last-millennium run with LOVECLIM (ref. 18) were used
to examine climate responses to single radiative forcing scenarios (three ensemble
members each) and to assess intra-model variability in full forcing simulations
(ten ensemble members). We also examine multiple last-millennium simulations
of the CSIRO Mk3L coupled climate model run with progressive addition of
radiative forcings (three ensemble members each; ref. 21). The climate response to
greenhouse gas forcing alone was further assessed using single forcing experiments of
the HadCM3 (ref. 20; four ensemble members) and NCAR CESM1 (ref. 19; three ensemble
members) models. 

For each of the models we examine surface air temperature (tas field in CMIP5
output files) averaged over the PAGES 2k continental palaeoclimate reconstruction 
regions, and sea surface temperature (tos field in CMIP5 output files) averaged over 
the PAGES Ocean2k tropical ocean reconstruction regions. Monthly resolution 
model output was used to generate annual or season-specific averages that cor- 
respond to each palaeoclimate reconstruction target region. Because we use the 
simulations to examine change-points rather than the magnitude of trends, we do 
not apply any drift corrections to the model output. Change-point analysis was 
performed on individual model runs and then compiled (Fig. 3, Supplementary 
Fig. 2, Extended Data Fig. 8), rather than on ensemble averages, to avoid loss of 
internal variability in the model data. 
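
The step from monthly model output to annual or season-specific averages can be sketched as follows. This is a minimal illustration in Python rather than the Matlab code archived with the study; the function name `seasonal_means` and the convention of labelling a season that wraps the year end by the year in which it ends are assumptions made for the example.

```python
from statistics import mean


def seasonal_means(monthly, months, start_year):
    """Average a monthly series over chosen calendar months, year by year.

    monthly    -- flat list with 12 values per year, starting in January of start_year
    months     -- calendar months (1-12) in seasonal order, e.g. [12, 1, 2] for DJF
    start_year -- calendar year of the first value

    A season that wraps the year end is labelled by the year in which it ends
    (an assumed convention). Incomplete seasons at the start are skipped.
    """
    # Month offsets relative to January of the labelling year; months before a
    # year wrap (December in DJF) receive negative offsets.
    offsets, shift, prev = [], 0, None
    for m in reversed(months):
        if prev is not None and m > prev:
            shift -= 12
        offsets.append(m - 1 + shift)
        prev = m

    result = {}
    for y in range(len(monthly) // 12):
        idx = [12 * y + o for o in offsets]
        if min(idx) < 0:
            continue  # season would start before the first record
        result[start_year + y] = mean(monthly[i] for i in idx)
    return result
```

Calling `seasonal_means(series, list(range(1, 13)), 850)` gives calendar-year means, whereas `[12, 1, 2]` or `[9, 10, 11, 12, 1, 2]` gives DJF or SONDJF means for regions whose reconstructions have a seasonal preference.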

Matlab data structures containing the model output (compiled as regional time 
series with annual-average and season-specific resolution) used in this study are 
archived with the NCEI World Data Service for Paleoclimatology at
http://www.ncdc.noaa.gov/paleo/study/20083.

Change-point analysis method. We analyse the trends in the PAGES 2k conti- 
nental and tropical ocean temperature reconstructions (Fig. 2a, Extended Data 
Fig. 1a), in the site-level terrestrial and marine palaeoclimate databases (Fig. 4, 
Extended Data Figs 6, 7) and in model simulations (Fig. 3, Extended Data Figs 2, 8, 
Supplementary Figs 2, 3) using the SiZer (SIgnificant ZERo crossings of derivatives)
method. SiZer determines the sign and significance of trends in time series
data across different levels of smoothing using a Gaussian kernel filter. Following
the method used in refs 14 and 23, we assess climate change-points from SiZer
output by determining the median year of initiation for the most recent significant
(P < 0.1) and sustained trends across smoothing bandwidths spanning all integer
years in the range 15–50 yr. This range of smoothing levels is designed to reduce
the influence of interannual-to-decadal climate variability on the detection of a sus- 
tained trend, while avoiding shifting the true change-point time if the smoothing 
window is too long. We assess trends in the time series since AD 1500, and the onset 
of the most recent significant trend is classified only if the sign of the trend persists 
through to the most recent end of the record (that is, a sustained trend). Extended 
Data Fig. 2a shows the SiZer data used to assess the median initiation point for 
recent significant warming trends across the continents and tropical oceans in 
palaeoclimate reconstructions (Fig. 2a). Supplementary Fig. 1 shows the SiZer 
data for multi-model palaeoclimate simulations (Fig. 3a). 
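
The logic of the onset assessment can be illustrated with a simplified stand-in for SiZer, written in Python. It is not the SiZer package or the archived Matlab code: it fits a Gaussian-weighted local line at each year, flags slopes that exceed 1.645 pointwise standard errors (a rough analogue of P < 0.1, whereas SiZer uses its own confidence limits), and takes the earliest significant year within the final run of same-signed slopes. Treating the bandwidth as the kernel standard deviation, tolerating non-significant gaps inside that run, and all function names are assumptions of this sketch.

```python
import math
from statistics import median


def local_slopes(years, values, h, z=1.645):
    """Gaussian-weighted local linear fit centred on each year.

    Returns one (slope, significant) pair per year, where significant means
    |slope| exceeds z pointwise standard errors. Unoptimized: O(n^2) per call.
    """
    fits = []
    for t0 in years:
        w = [math.exp(-0.5 * ((t - t0) / h) ** 2) for t in years]
        sw = sum(w)
        xbar = sum(wi * t for wi, t in zip(w, years)) / sw
        ybar = sum(wi * y for wi, y in zip(w, values)) / sw
        sxx = sum(wi * (t - xbar) ** 2 for wi, t in zip(w, years))
        slope = sum(wi * (t - xbar) * (y - ybar)
                    for wi, t, y in zip(w, years, values)) / sxx
        # Weighted residual variance about the local line.
        resid_var = sum(wi * (y - ybar - slope * (t - xbar)) ** 2
                        for wi, t, y in zip(w, years, values)) / sw
        se = math.sqrt(resid_var * sum(wi * wi * (t - xbar) ** 2
                                       for wi, t in zip(w, years))) / sxx
        fits.append((slope, abs(slope) > z * se))
    return fits


def onset_year(years, values, h):
    """Earliest significant year of the trend that persists to the record end."""
    fits = local_slopes(years, values, h)
    final_positive = fits[-1][0] > 0
    onset = None
    for year, (slope, significant) in zip(reversed(years), reversed(fits)):
        if (slope > 0) != final_positive:
            break  # sign changed, so the trend is not sustained before here
        if significant:
            onset = year
    return onset


def median_onset(years, values, bandwidths=range(15, 51)):
    """Median onset across all integer bandwidths, ignoring failed detections."""
    found = [o for o in (onset_year(years, values, h) for h in bandwidths)
             if o is not None]
    return median(found) if found else None
```

Taking the median over bandwidths means that no single smoothing level decides the result, which is the role the 15–50-yr range plays in the text above.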

Change-point method testing. Change-points determined by the SiZer method 
were tested on a set of synthetic time series with known warming onset (Extended 
Data Fig. 3). We also compare SiZer estimates for the synthetic warming onset with 
change-points determined using linear change-point methods. The synthetic
time series were designed to test the performance of change-point detection meth- 
ods across different forms of long-term trends, in the presence of volcanic-style 
cooling events, and for varying magnitudes and redness (lag-1 autocorrelation) 
of variability superimposed upon the trend. Each test was carried out across 
1,000-member ensembles to generate a distribution of change-point estimates for 
each method. 
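
A synthetic series of the kind described here can be generated along the following lines. This Python sketch follows the trend form of test (iii) in Extended Data Fig. 3 (a small linear downward trend, then one-quarter of a sine period as an accelerating warming) with AR(1) noise superimposed. The parameter names, the default segment lengths and the choice to scale the 2σ noise level against the warming reached 100 yr after the change-point are assumptions for illustration.

```python
import math
import random


def ar1_noise(n, phi, sigma, rng):
    """AR(1) noise with lag-1 autocorrelation phi and stationary s.d. sigma."""
    innovation_sd = sigma * math.sqrt(1.0 - phi ** 2)
    x = [rng.gauss(0.0, sigma)]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, innovation_sd))
    return x


def synthetic_series(n_before=200, n_after=100, pre_slope=-0.001,
                     warming=1.0, phi=0.1, noise_ratio=0.5, seed=0):
    """Return (years, trend, trend + noise) with a change-point at year 0.

    noise_ratio is the 2-sigma noise level as a fraction of `warming`,
    so noise_ratio=0.5 corresponds to a trend-to-noise ratio of 1:0.5.
    """
    rng = random.Random(seed)
    years = list(range(-n_before, n_after + 1))
    trend = []
    for t in years:
        if t <= 0:
            # Gentle decline that reaches zero at the change-point.
            trend.append(pre_slope * t)
        else:
            # Quarter sine period: slow at first, steepest at the end.
            trend.append(warming * (1.0 - math.cos(0.5 * math.pi * t / n_after)))
    sigma = noise_ratio * warming / 2.0
    noise = ar1_noise(len(years), phi, sigma, rng)
    return years, trend, [a + b for a, b in zip(trend, noise)]
```

Repeating the call with 1,000 different seeds and passing each series to a change-point routine would give a distribution of onset estimates comparable in spirit to the ensembles described above.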

Linear change-point detection methods best capture the change-point in syn- 
thetic time series when the long-term trend is derived from two straight lines 
(Extended Data Fig. 3a, series i, ii). However, the SiZer method is more adapt- 
able than linear change-point methods in detecting the true change-point in time 
series in which the long-term trend is a curve rather than a straight line (Extended 
Data Fig. 3a, series iii). This is expected to be advantageous for detecting the ini- 
tial thermodynamic response to increases in atmospheric greenhouse gas levels, 
which have an accelerating trajectory during the industrial era (Fig. 3d). Previous 
research has concluded that, despite the complexity of the climate system, there is 
a near-linear relationship between global radiative forcing changes and the climate 
response. Hence, we would expect industrial era climate trends to be better
approximated by a curve than by a simple straight line. 

The addition of synthetic volcanic-style cooling events at the time of, and before, 
the change-point in synthetic series causes only small deviations in SiZer estimates 
of the climate change-point away from the true synthetic change-point (Extended 
Data Fig. 3a, series iv-vi). In our tests with a curved (accelerating) warming trend 
and volcanic-style cooling events centred at 0 yr, −25 yr and −50 yr relative to
the onset of the warming trend, the SiZer method returns median times for the
warming onset at −1 yr, −9 yr and −13 yr, respectively. This gives us confidence
that large volcanic eruptions during the early nineteenth century are not likely to 
have substantially skewed the detection of the onset of industrial-era warming 
trends assessed using the SiZer method. 

Finally, we test the sensitivity of change-point detection methods to the addition 
of varying climate variability (autoregressive AR(1) noise) superimposed on the 
same long-term warming trend. As the magnitude of climate variability increases, 
the detection of change-points becomes progressively later using the SiZer method 
(Extended Data Fig. 3b). As a result, climate time series from regions with large
interannual-to-multi-decadal variability may have delayed detection of the onset of
industrial-era warming relative to regions with small variability, unless the magni- 
tude of the underlying warming trend is also larger in these regions. In our testing, 
different levels of lag-1 autocorrelation for the AR(1) noise added to an underlying 
trend does not alter the median estimate for the onset of warming; however, the 
range of onset estimates about this median becomes greater as autocorrelation 
increases (Extended Data Fig. 3c). 

Across all of the tests examined here, the linear intersection and Bayesian 
change-point detection methods produce much wider ranges of warming onset 
estimates than those produced using the SiZer method on the same synthetic 
ensembles (Extended Data Fig. 3a–c). The linear methods are also less able to
detect change-points for trends that are not simple linear functions. On the basis 
of our change-point method testing, we use the SiZer method in this study, because 
it appears to be most adaptable and stable for dealing with the climate changes that 
characterize industrial-era warming. 

We apply our method testing to assess the range of uncertainty in estimates of
the onset of regional warming related to the SiZer change-point detection method
(Extended Data Fig. 1b). This is carried out using an accelerating warming trend 
upon which AR(1) noise is added that has the same lag-1 autocorrelation and 
trend-to-variability characteristics as the regional reconstructions. These param- 
eters are estimated by calculating characteristics of residuals about the 15-yr filters 
(as in Fig. 2a) of the regional reconstructions. Uncertainty in onset estimates related 
to the SiZer method is small, typically better than ±25 yr (5%–95%). Exceptions
are Antarctica, for which the small trend relative to variability does not allow for 
the detection of significant trends, and Asia, for which strong lag-1 autocorrelation 
leads to uncertainty in the lower (5%) bound for the onset of warming (Extended 
Data Fig. 1b). 

Emergence testing. We use the regional temperature reconstructions to determine
the extent to which industrial-era warming has caused regional climates to emerge
above the level of pre-industrial variability. We choose the interval AD 1622–1799
as the climate baseline to test emergence against. The starting point (AD 1622)
represents the earliest year for which temperature reconstructions are available
for all of the terrestrial and marine regions examined in this study. The end year
(AD 1799) was chosen as a time well before the onset of industrial-era warming
in any of the regional reconstructions (Table 1). It is also before strong volcanic 
cooling events associated with the 1809 Unknown and 1815 Tambora eruptions, 
which could skew the reference period towards cooler states. 

Time of emergence is detected when a climate change signal emerges above a 
defined noise threshold. Here we use a threshold of 2σ of interannual variability
above the mean of the reference interval (Fig. 2c). We smooth the reconstructions 
using filters with widths of 15-50 yr (the same as for our change-point assessments) 
and calculate the median time when the climate signal (smoothed reconstructions) 
emerges and stays above the noise threshold. The emergence year is quite insen- 
sitive to the level of smoothing we apply to the climate signal across the 15-50-yr 
filter widths, with the 5%–95% range of emergence estimates being between only
1 yr and 8 yr for the regional reconstructions for which climate emergence is found
to occur. 
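
The emergence test described above can be sketched in Python as follows. The threshold is the mean of the reference interval plus 2σ of its interannual variability, and the emergence year is the first year after which the smoothed series stays above that threshold to the end of the record. A centred running mean is used here as a simple substitute for the filters applied in the study, and the function names are hypothetical.

```python
from statistics import mean, median, stdev


def running_mean(values, width):
    """Centred running mean; windows are truncated at the ends of the record."""
    half = width // 2
    return [mean(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]


def emergence_year(years, values, width,
                   ref_start=1622, ref_end=1799, n_sigma=2.0):
    """First year from which the smoothed series stays above the threshold."""
    ref = [v for y, v in zip(years, values) if ref_start <= y <= ref_end]
    threshold = mean(ref) + n_sigma * stdev(ref)
    smooth = running_mean(values, width)
    emerged = None
    # Walk back from the end of the record while the signal stays above.
    for y, s in zip(reversed(years), reversed(smooth)):
        if s <= threshold:
            break
        emerged = y
    return emerged


def median_emergence(years, values, widths=range(15, 51)):
    """Median emergence year across filter widths, or None if never emerged."""
    found = [e for e in (emergence_year(years, values, w) for w in widths)
             if e is not None]
    return median(found) if found else None
```

Changing `ref_start` and `ref_end` reproduces the kind of reference-interval sensitivity test discussed in the following paragraph.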

A difference between our climate emergence assessment and previously pub- 
lished time-of-emergence studies is that the palaeoclimate reconstructions allow 
us to assess climate emergence using a long baseline interval that occurs entirely 
before the onset of industrial-era warming. Our results demonstrate that, in some 
regions, industrial-era warming has already caused climate to emerge above the 
range of natural variability in the AD 1622-1799 reference interval (Table 1). We 
test the sensitivity of this result to the choice of reference interval and find that 
all tested reference intervals before the onset of industrial-era warming result in 
similar regional emergence patterns and timings, whereas reference periods that 
include parts of the industrial-era warming signal result in later estimates for the 
time of emergence. For example, in the Arctic reconstruction, time-of-emergence 
estimates based on different reference periods are: AD 1947 (AD 1500–1799
reference), AD 1930 (AD 1600–1799 reference), AD 1930 (AD 1622–1799 reference;
Table 1), AD 1938 (AD 1700–1799 reference), AD 1960 (AD 1800–1899 reference)
and AD 1978 (AD 1850–1899 reference). The overlapping interval of the regional
reconstructions that we use for our time-of-emergence reference (AD 1622–1799)
occurs during the time of coolest conditions during the past 2,000 years. Using
longer palaeoclimate reference intervals that incorporate earlier, warm intervals of 
the past 2,000 years will alter time-of-emergence results, but it is not currently pos- 
sible to perform this assessment consistently between regions, owing to the length 
limitations of currently available tropical ocean temperature reconstructions. 

Instrumental data. We plot temperature trends from gridded instrumental
datasets in Fig. 1. The surface air temperature datasets are from the Climatic
Research Unit (CRU) TS3.22 product, or the ERAI-f product for surface air
temperature over Antarctica, and the sea surface temperature datasets are from
the HadISST product. The CO2-equivalent record shown in Fig. 3d is derived
from ref. 59.

Code availability. Matlab code for assessing the onset (and sign) of industrial-era 
climate trends using the SiZer method is archived with the NCEI World Data 
Service for Paleoclimatology at http://www.ncdc.noaa.gov/paleo/study/20083. 
These files include the original SiZer package obtained from
http://www.unc.edu/~marron/marron_software.html, and additional code to assess the
onset of sustained, significant temperature trends in annually resolved climate records.

Extended Data Figure 1 | Regional distributions of onset estimates
for industrial-era warming and post-AD 1800 warming trends.
a, Median (black vertical bars; as in Fig. 2a) onset of sustained, significant
warming in regional reconstructions, shown with uncertainty ranges in
onset estimates based on available reconstruction ensembles (colours;
Methods). Distributions of onset estimates related to reconstruction
uncertainty denote median (vertical bars), 25%–75% range (boxes) and
5%–95% range (horizontal lines). Distributions for the onset of warming
are also calculated for Northern Hemisphere and Southern Hemisphere
reconstruction ensembles, with additional details in Supplementary
Fig. 1. The size of reconstruction ensembles is given (n). b, As in a, but
for uncertainty ranges in onset estimates based on the SiZer change-
point detection method. Distributions (grey) denote the uncertainty in
detecting a known warming onset based on synthetic tests, for which
1,000 noise series with lag-1 autocorrelation and trend-to-variability
characteristics matching the regional reconstructions are applied to an
underlying trend (Methods, Extended Data Fig. 3). Crosses are used for
regions for which low trend-to-variability characteristics (Antarctica) or
high autocorrelation (Asia) limit the detection of a sustained, significant
warming trend in the synthetic tests. c, Distribution of century-scale
(100-yr) linear warming trends since AD 1800 (coloured bars), shown with
reference to the 5%–95% range of century-scale trends beginning during
the period AD 1500–1799 (grey shading). Values denote the percentage
of century-scale trends since AD 1800 that lie below the 5% level (left)
or above the 95% level (right) of trends beginning during the period
AD 1500–1799.

Extended Data Figure 2 | SiZer trend maps used to assess the onset
of sustained, significant warming. a, SiZer trend maps for each of
the regional land and ocean temperature reconstructions. The timing of
warming (red) and cooling (blue) trends is calculated at different levels
of smoothing. Significant (P < 0.1) trends are shown by dark red and dark
blue shading. Vertical lines indicate the median onset time for the most
recent phase of sustained, significant warming calculated across 15–50-yr
filter widths (as used in Fig. 2a; Table 1). b, As in a, but for the ensemble
mean of regional surface air and surface ocean temperatures across CMIP5
last-millennium and historical model simulations. The SiZer analysis
for the multi-model ensemble mean is shown for illustrative purposes
only and removes the influence of unforced variability to highlight the
multi-model thermodynamic response to climate forcings since AD 1500.
The multi-model change-point distributions shown in Fig. 3a are based
on SiZer analysis of individual experiments (Supplementary Fig. 2),
not on the ensemble mean shown here. c, d, Radiative climate forcings
from greenhouse (green), solar (orange) and volcanic (red) sources since
AD 1500. Note that the magnitude of short-term forcing from large
volcanic events exceeds the lower limit of the plot axis.

Extended Data Figure 3 | Assessment of change-point detection
methods using synthetic time series. a, Example synthetic time series
(grey), consisting of long-term trends with a change-point at year 0 (black)
and AR(1) noise with a lag-1 autocorrelation of 0.1 and ratio of 100-yr
trend to 2σ noise of 1:0.5. The synthetic trends represent: (i) no trend, then
linear upward trend; (ii) small linear downward trend, then linear upward
trend; (iii) small linear downward trend then accelerating upward trend
(one-quarter of a period of a sine curve); (iv–vi) as in (iii), but with 10-year
long downward excursions centred at 0 years (iv), −25 years (v) and −50
years (vi) relative to the onset of the accelerating upward trend. Synthetic
trends are designed to capture known features of Earth’s climate evolution,
namely, a long-term gradual pre-industrial cooling trend followed by
accelerating industrial-era warming with superimposed episodic volcanic
cooling events. The distributions of change-point results using the SiZer
method (blue; Methods), the best-fit intersection of two straight lines
(orange) and a Bayesian linear change-point method (purple; screening
for one change-point and selecting the time of maximum probability) are
shown for each experiment. Distributions show the median (thick vertical
bars), 25%–75% range (boxes) and 5%–95% range (horizontal lines) of
change-points returned across 1,000-member ensembles for each test.
b, As in a, but testing the influence of different magnitudes of AR(1) noise
on detecting the onset of warming in an underlying trend (using the small
linear downward trend then accelerating upward trend, as in test (iii) in a).
Tests use a ratio of the 100-yr trend to 2σ noise of 1:0.2 (vii), 1:0.5 (viii),
1:1 (ix) and 1:1.5 (x). c, As in b, but testing the influence of different AR(1)
autocorrelation on detecting the onset of warming in an underlying trend.
Tests use lag-1 autocorrelations of 0.1 (xi), 0.3 (xii), 0.5 (xiii) and 0.7 (xiv).
See Methods for a detailed discussion of these change-point method tests
and Extended Data Fig. 1b for application of the SiZer method tests using
signal-to-noise parameters applicable to the regional reconstructions.

Extended Data Figure 4 | Sensitivity of the onset of sustained,
significant warming to seasonality. Median onset of sustained,
significant warming in regional reconstructions (grey vertical bars; as in
Fig. 2a), shown against the median (blue vertical bars), 25%–75% range
(boxes) and 5%–95% range (horizontal lines) of corresponding regional
median warming onsets across ten multi-model last-millennium climate
simulations with full radiative forcings. Dark blue distribution plots
(as in Fig. 3a) show the onset of warming for regions for which regional
temperature information from the models has been extracted for annual
or season-specific intervals that match the climate representation of
the regional reconstructions (as defined in refs 4, 14). For comparison,
light blue distributions show the model results for the median onset
of sustained, significant warming using annual average data for the
reconstruction regions with a seasonal preference (Europe, Asia,
Australasia and South America). The similarity between change-point
results based on annual average and season-specific model data suggests
that it is unlikely that seasonality plays a role in the regional characteristics
of the onset of sustained, significant warming described here. JJA,
June–August; SONDJF, September–February; DJF, December–February.

Extended Data Figure 5 | Sensitivity of the onset of sustained,
significant warming to filter width. Distributions across 15–50-yr filter
widths for the regional onset of sustained, significant warming (Extended
Data Fig. 2a) showing the median (grey vertical bars), 25%–75% range
(grey boxes) and 5%–95% range (grey horizontal lines). Coloured circles
demonstrate the change-points determined at specific filter widths, with
markers at AD 2015 indicating that a sustained recent warming is not
detected. With the exception of Antarctica, for which significant warming 
is not observed at any filter width, the change-point analysis shows that 
shorter filter widths (less smoothing) yield more recent onset dates for 
sustained warming. This is because decadal-scale variability resets the 
time over which significant warming is determined to have been sustained 
in records with less smoothing. This effect accounts for the wide right- 
side tails produced in warming onset distributions for the Northern 
Hemisphere reconstruction regions for which decadal-scale variability is 
particularly strong (Fig. 2). 

Extended Data Figure 6 | Site-level onset of sustained, significant
trends. a, b, The same SiZer-based trend analysis performed on the
regional reconstructions (Methods, Fig. 2a) was applied to individual
temperature-sensitive marine and terrestrial records to determine
the median time of onset of sustained, significant industrial-era warming
(a) or cooling (b) trends. c, Number of records available in a 2° × 2° grid
region (maximum n = 28). Where multiple site-level records are available
in a 2° × 2° grid, the median-of-medians change-point is shown in a or b.
Symbols are crossed in a and b if the interquartile range (25–75 percentile
range) of change-points within a grid box exceeds 80 years. In all plots,
terrestrial data are shown as squares and marine data are shown as
circles. The onset of site-level warming and cooling trends is compiled by
latitudinal band in Fig. 4.

a
Ocean basin    Hemisphere  Latitude  Longitude  Depth   Mean resolution  Seasonality  Upwelling/     Proxy type    Database  Ref.
                           (°N)      (°E)       (m)     (max, min)                    non-upwelling                number
                                                        (yr per sample)               (a priori)
Pacific        Southern    -3.53     119.20     -472    4 (7, <1)        annual       Non            Mg/Ca         0335a     60
Pacific        Northern    4.67      -77.96     -2200   12 (22, 11)      annual       Non            alkenone      1177      61
Pacific        Northern    4.85      -77.61     -884    10 (19, 9)       annual       Non            alkenone      1178      61
Pacific        Southern    -18.33    146.45     -10     5 (5, 5)         annual       Non            coral Sr/Ca   1172      62
Pacific        Northern    34.23     -120.02    -590    1 (2, <1)        annual       Non            alkenone      1582      63-65
Pacific        Southern    -14.13    -76.50     -299    2 (5, 1)         annual       Upwelling      alkenone      1571      66
Pacific        Northern    27.90     -111.66    -655    4 (9, 4)         annual       Upwelling      alkenone      1575      67
Indian         Northern    24.83     65.92      -695    6 (15, <1)       annual       Non            alkenone      1574      68
Atlantic       Northern    30.85     -10.10     -355    3 (6, <1)        annual       Upwelling      alkenone      0487      37
Atlantic       Northern    10.77     -64.77     -450    1 (3, <1)        MAM          Non            Mg/Ca         0039      69
Atlantic       Northern    16.84     -16.73     -323    3 (7, 0)         JASOND       Non            Mg/Ca         0488      70
Atlantic       Southern    -29.14    16.72      -97     4 (20, <1)       annual       Upwelling      alkenone      0484      71
Atlantic       Northern    55.50     -13.90     -2543   13 (14, 11)      AMJJ         Non            Mg/Ca         0058      72
Atlantic       Northern    66.55     -17.42     -470    3 (5, 1)         JJA          Non            alkenone      0234      73
Atlantic       Northern    38.56     -9.35      -90     2 (6, <1)        ONDJFMA      Non            alkenone      1183      74
Atlantic       Northern    10.65     -64.66     -432    5 (11, 2)        annual       Upwelling      alkenone      1576      67
Mediterranean  Northern    39.85     17.81      -210    4 (4, 4)         NDJFM        Non            alkenone      1152      75
Mediterranean  Northern    40.50     4.03       -2394   16 (28, 2)       annual       Non            alkenone      1157      76
Mediterranean  Northern    35.99     -4.75      -1022   6 (7, 6)         annual       Upwelling      alkenone      1572a     77
Mediterranean  Northern    36.21     -4.31      -1108   8 (12, 6)        annual       Upwelling      alkenone      1572b     77
Arctic         Northern    78.92     6.77       -1497   17 (18, 16)      annual       Non            dinocyst      1147      78

[Panels b–d: b, Non-upwelling vs upwelling; c, Tropical vs NH extratropical; d, Mg/Ca vs alkenones]

Extended Data Figure 7 | Moderately resolved marine records and
their industrial-era trends. a, Details of the moderate-resolution, SST-
sensitive marine records (refs 37, 60–78) compiled by the Ocean2k working group.
See ref. 15 for further details on these records. Acronyms under seasonality
refer to months. b–d, Cumulative distributions of the onset of sustained,
significant warming (red; upward) or cooling (blue; downward) in
subsets of the moderately resolved marine records. SiZer-based trend
analysis was performed on the site-level records and is compiled using
the same subsets examined in the ‘mini-bir’ analysis of ref. 15. Shading
in a denotes records for which SiZer analysis detects recent significant
cooling trends. The subsets of moderately resolved marine records
produce the best differentiation between recent warming and cooling
trends when the site-level change-point results are grouped according to
non-upwelling (warming) and upwelling (cooling) sites. This suggests that
enhanced ocean upwelling during the industrial era may provide a more
robust mechanism to account for recent cooling trends in some localized
parts of the world oceans, rather than differentiation based on latitude or
geochemical analysis type. NH, Northern Hemisphere.

Extended Data Figure 8 | Onset of sustained, significant warming
trends in climate simulations. The same SiZer-based change-point
analysis performed on the regional reconstructions (Methods) was applied
to climate simulations compiled at 5° × 5° grid resolution to determine the
spatial fingerprint for the onset of industrial-era trends. a, b, The multi-
model mean for the onset of sustained, significant warming in surface air
temperature (a) and sea surface temperature (b) calculated for the first
ensemble member of LOVECLIM, CSIRO Mk3L, CCSM4 and HadCM3
last-millennium experiments with full transient radiative forcings
(‘All forcing’). c, d, As in a and b, but for the first ensemble member of
greenhouse-only (‘GHG-only’) forced experiments of the same models.
Crosses indicate grid boxes in which one or more models produce recent
significant cooling rather than warming trends. See Supplementary Fig. 3
for results from individual models. In the subset of models examined here,
the fingerprint of warming onset across the oceans is characterized by
early warming of most tropical ocean regions except the equatorial eastern
Pacific. Delayed warming (or cooling) occurs in the North Atlantic,
Arctic and Southern Oceans. The onset of sustained warming is delayed at
the grid level in many terrestrial parts of the Northern Hemisphere mid-
to high latitudes, owing to large decadal-scale model variability over these
regions, a feature that is reduced (but still evident; Extended Data Fig. 5) in
regionally averaged climate data. The delayed onset of Antarctic warming
is not evident in this grid-level assessment, consistent with Southern
Hemisphere data–model disagreement observed in regionally averaged
data.

Extended Data Table 1 | High-resolution, SST-sensitive coral data compiled by the Ocean2k project

Ocean basin      Latitude  Longitude  Site name        Coral genus  Sampling    Proxy type  References
                 (°N)      (°E)                                     resolution
western Pacific  27.10     142.20     Chichijima       Porites      seasonal    Sr/Ca       79
western Pacific  27.10     142.20     Chichijima       Porites      seasonal    δ18O        79
western Pacific  13.60     144.83     Double Reef      Porites      monthly     δ18O        80
western Pacific  -8.26     115.57     Bali             Porites      monthly     δ18O        81
western Pacific  -1.50     124.83     Bunaken          Porites      monthly     δ18O        81
western Pacific  -17.50    -149.83    Moorea           Porites      annual      δ18O        82,83
western Pacific  -5.22     145.82     Madang           Porites      seasonal    δ18O        84
western Pacific  -4.15     144.88     Liang            Porites      seasonal    δ18O        84
western Pacific  -2.50     150.5      Kavieng          Porites      monthly     Sr/Ca       85
western Pacific  -4.18     151.98     Rabul            Porites      monthly     δ18O        86
western Pacific  -4.18     151.98     Rabul            Porites      monthly     Sr/Ca       86
western Pacific  -21.24    -159.83    Raratonga*       Porites      seasonal    δ18O        87
western Pacific  -21.24    -159.83    Raratonga*       Porites      seasonal    Sr/Ca       87
western Pacific  -15.00    166.99     Espiritu Santo   Porites      annual      δ18O        88
western Pacific  -15.94    166.04     Sabine Bank*     Porites      monthly     δ18O        89
western Pacific  -22.29    166.27     Amedee Island*   Porites      monthly     Sr/Ca       90,91
western Pacific  -22.48    166.47     Amadee Island    Porites      seasonal    δ18O        92
western Pacific  -16.82    179.23     Suvasuva Bay     Diploastrea  annual      δ18O        93
western Pacific  -16.82    179.23     Suvasuva Bay*    Porites      annual      δ18O        87
western Pacific  -16.82    179.23     Suvasuva Bay     Porites      annual                  87
western Pacific  -21.91    113.97     Ningaloo Reef    Porites      seasonal                94
western Pacific  -22.10    153.00     Abraham Reef     Porites      annual                  95


western Pacific 
eastern Pacific 


-19.90 


Clipperton Atoll* 


eastern Pacific -91.23 Urvina Bay* 
eastern Pacific 173.00 Maiana Atoll 
eastern Pacific 172.00 Tarawa Atoll 
eastern Pacific 166.93 Nauru* 

eastern Pacific -82.00 Secas Island 


Palmyra Atoll* 


Malindi 
45.10 Mayotte 
45.10 Mayotte 
43.58 lfaty Reef 
113.77 Houtman Abrolhos 
39.50 Mafia Island 
98.52 Mentawai* 
55.00 Mahe, Seychelles 
40.10 Malindi 
55.00 La Reunion 
34.97 Aquaba* 
34.31 Ras Um Sidd* 
Ras Um Sidd 
western i : Bermuda 
western Atlantic 30.65 -64.99 Bermuda 
western Atlantic 16.20 -61.49 Guadeloupe 
western Atlantic 16.20 -61.49 Guadeloupe 
western Atlantic 17.93 -67.00 Puerto Rico 
western Atlantic 17.93 -67.00 Puerto Rico 
western Atlantic 32.47 -64.70 Bermuda 
western Atlantic 32.47 -64.70 Bermuda 
western Atlantic 24.93 -80.75 Florida Bay 
western Atlantic 25.38 -80.17 Biscayne, Florida 
western Atlantic 25.84 -78.62 Bahamas 
western Atlantic 20.83 -86.74 Yucatan* 
western Atlantic 24.66 -82.83 Dry Tortugas* 


Porites 


Porites 
Porites 
Porites 
Porites 
Porites 
Porites 
Porites


Porites 
Porites 
Porites 
Porites 
Porites 
Diploastrea 
Porites 
Porites 
Porites 
Porites 
Porites 
Porites 
Porites 
ip 
Diploria 
Diploria 
Diploria 
Montastrea 
Montastrea 
Diploria 
Diploria 
Solenastrea 
Montastrea 
Siderastrea 
Siderastrea 
Siderastrea 


annual 
seasonal 
annual 
seasonal 
monthly 
seasonal 
seasonal 
monthly 
monthly 
annual 
seasonal
seasonal
seasonal
seasonal
seasonal
monthly
monthly
monthly
seasonal
annual
seasonal
seasonal


annual 
annual 
monthly 
monthly 
annual 
annual 
monthly 
monthly 
annual 
annual 
annual 
annual 
monthly 


ee A NEA go 


δ18O 98
δ18O 99
δ18O 100,101
δ18O 102
δ18O 103
δ18O 104
Sr/Ca
Fae)
δ18O 108,109
Sr/Ca 108,109
δ18O 110
δ18O 111
δ18O 112
δ18O 113
δ18O 114
δ18O 115
δ18O 116
δ18O 117
δ18O 118
δ18O 119
Sr/Ca 120
δ18O 121
Sr/Ca 121
δ18O 122
Sr/Ca 122
δ18O 123
Sr/Ca 123
δ18O 124
δ18O 125
growth rate 126
growth rate 127
Sr/Ca 128


See refs 79–128 and the database synthesis in ref. 14 for a detailed discussion of the high-resolution coral records. Shading denotes records for which SiZer analysis detects recent significant cooling
trends (Extended Data Fig. 6b). Strong multi-decadal variability may account for recent SST cooling signals in site-level coral records from Galapagos, Japan and Bermuda, whereas coral Sr/Ca records
from Papua New Guinea are believed to have non-climatic influences.
*Composite of multiple coral records.

Extended Data Table 2 | Details of the last-millennium and historical climate simulations

                  Sea surface temperature (tos files)       Surface air temperature (tas files)
Model             Last Millennium     Historical            Last Millennium     Historical            References
BCC-CSM1.1        past1000_r1i1p1     past1000_r1i1p1       past1000_r1i1p1     past1000_r1i1p1       http://forecast.bcccsm.ncc-cma.net/web/channel-43.htm
MIROC-ESM         past1000_r1i1p1     historical_r1i1p1     past1000_r1i1p1     historical_r1i1p1     129
IPSL CM5A-LR      not available       not available         past1000_r1i1p1     historical_r1i1p1     http://icmc.ipsl.fr/index.php/cmip5
MPI-ESM-P         past1000_r1i1p1     historical_r1i1p1     past1000_r1i1p1     historical_r1i1p1     130; http://www.mpimet.mpg.de/en/science/models/mpi-esm.html
NCAR CCSM4        past1000_r1i1p1     historical_r1i1p1     past1000_r1i1p1     historical_r1i1p1     131; http://www.cesm.ucar.edu/experiments/
HadCM3            past1000_r1i1p1     pers. comm.           past1000_r1i1p1     pers. comm.           20
GISS-E2-R*        past1000_r1i1p124   historical_r1i1p124   past1000_r1i1p124   historical_r1i1p124   http://data.giss.nasa.gov/modelE/ar5/
FGOALS-s2         past1000_r1i1p1     pers. comm.           not available       not available

Model             Sea surface temperature (tos files)                          Surface air temperature (tas files)                          References
LOVECLIM+         Full forcing (10 member ensemble; e1-e10)                    Full forcing (10 member ensemble; e1-e10)                    18
LOVECLIM          Greenhouse only (3 member ensemble)                          Greenhouse only (3 member ensemble)                          18
LOVECLIM          Solar only (3 member ensemble)                               Solar only (3 member ensemble)                               18
LOVECLIM          Volcanic only (3 member ensemble)                            Volcanic only (3 member ensemble)                            18
CSIRO-Mk3L v1.2   Orbital + Greenhouse (3 member ensemble)                     Orbital + Greenhouse (3 member ensemble)                     21
CSIRO-Mk3L v1.2   Orbital + Greenhouse + Solar (3 member ensemble)             Orbital + Greenhouse + Solar (3 member ensemble)             21
CSIRO-Mk3L v1.2+  Orbital + Greenhouse + Solar + Volcanic (3 member ensemble)  Orbital + Greenhouse + Solar + Volcanic (3 member ensemble)  21
HadCM3            Greenhouse only (4 member ensemble)                          Greenhouse only (4 member ensemble)                          20
NCAR CESM1        Greenhouse only (3 member ensemble)                          Greenhouse only (3 member ensemble)                          19

This table includes CMIP5 experiment and ensemble member identification and details of additional idealized forcing experiments (refs 18-21, 129-131) (see also http://forecast.bcccsm.ncc-cma.net/web/channel-43.htm,
http://icmc.ipsl.fr/index.php/cmip5, http://www.mpimet.mpg.de/en/science/models/mpi-esm.html, http://www.cesm.ucar.edu/experiments and http://data.giss.nasa.gov/modelE/ar5).
Under CMIP5 protocols®!, last-millennium simulations cover the period AD 850-1850 and historical simulations cover the period AD 1851-2005. The number of models that have run the CMIP5
transient last-millennium experiment is smaller than the full set of CMIP5 climate models.

*Experiment 24 of GISS-E2-R last-millennium ensemble is used for the multi-model assessment (Fig. 3a). The GISS-E2-R ensembles use different combinations of datasets for solar forcing, volcanic
forcing and land-use change, but do not include specialized single or cumulative forcing experiments such as those analysed in Fig. 3b-d.

†The first ensemble members of these experiments contribute to the multi-model assessment (Fig. 3a).
Genomic insights into the origin of farming in the ancient Near East


Iosif Lazaridis, Dani Nadel, Gary Rollefson, Deborah C. Merrett, Nadin Rohland, Swapan Mallick, Daniel Fernandes,
Mario Novak, Beatriz Gamarra, Kendra Sirak, Sarah Connell, Kristin Stewardson, Eadaoin Harney, Qiaomei Fu,
Gloria Gonzalez-Fortes, Eppie R. Jones, Songül Alpaslan Roodenberg, György Lengyel, Fanny Bocquentin,
Boris Gasparian, Janet M. Monge, Michael Gregg, Vered Eshed, Ahuva-Sivan Mizrahi, Christopher Meiklejohn,
Fokke Gerritsen, Luminita Bejenaru, Matthias Blüher, Archie Campbell, Gianpiero Cavalleri, David Comas,
Philippe Froguel, Edmund Gilbert, Shona M. Kerr, Peter Kovacs, Johannes Krause, Darren McGettigan,
Michael Merrigan, D. Andrew Merriwether, Seamus O'Reilly, Martin B. Richards, Ornella Semino,
Michel Shamoon-Pour, Gheorghe Stefanescu, Michael Stumvoll, Anke Tönjes, Antonio Torroni, James F. Wilson,
Loic Yengo, Nelli A. Hovhannisyan, Nick Patterson, Ron Pinhasi & David Reich


We report genome-wide ancient DNA from 44 ancient Near Easterners ranging in time between ~12,000 and 1,400 BC,
from Natufian hunter-gatherers to Bronze Age farmers. We show that the earliest populations of the Near East derived
around half their ancestry from a ‘Basal Eurasian’ lineage that had little if any Neanderthal admixture and that separated
from other non-African lineages before their separation from each other. The first farmers of the southern Levant (Israel
and Jordan) and Zagros Mountains (Iran) were strongly genetically differentiated, and each descended from local
hunter-gatherers. By the time of the Bronze Age, these two populations and Anatolian-related farmers had mixed with each
other and with the hunter-gatherers of Europe to greatly reduce genetic differentiation. The impact of the Near Eastern
farmers extended beyond the Near East: farmers related to those of Anatolia spread westward into Europe; farmers
related to those of the Levant spread southward into East Africa; farmers related to those of Iran spread northward into
the Eurasian steppe; and people related to both the early farmers of Iran and to the pastoralists of the Eurasian steppe
spread eastward into South Asia.


Between 10,000 and 9,000 BC, humans began practicing agriculture
in the Near East (ref. 1). In the ensuing five millennia, plants and animals
domesticated in the Near East spread throughout West Eurasia (a vast 
region that also includes Europe) and beyond. The relative homoge- 
neity of present-day West Eurasians in a world context suggests the 
possibility of extensive migration and admixture that homogenized 
geographically and genetically disparate sources of ancestry. The spread 
of the world’s first farmers from the Near East would have been a mech- 
anism for such homogenization. To date, however, owing to the poor 
preservation of DNA in warm climates, it has been impossible to study 
the population structure and history of the first farmers and to trace 
their contribution to later populations. 


In order to overcome the obstacle of poor DNA preservation, we 
took advantage of two methodological developments. First, we sampled 
from the inner ear region of the petrous bone* which can yield up 
to ~100 times more endogenous DNA than other skeletal elements’. 
Second, we used in-solution hybridization® to enrich extracted DNA 
for about 1.2 million single nucleotide polymorphism (SNP) targets®”, 
making efficient sequencing practical by filtering out microbial and 
non-informative human DNA. We merged all sequences extracted from 
each individual, and randomly sampled a single sequence with min- 
imum mapping and sequence quality to represent each SNP, restrict- 
ing our investigation to individuals with at least 9,000 SNPs covered 
at least once (Methods). We obtained genome-wide data that passed
quality control for 45 individuals on whom we had a median coverage
of 172,819 SNPs. We assembled direct radiocarbon dates on skeletal
remains from 26 of these individuals (22 newly generated for this study)
(Supplementary Table 1).

Figure 1 | Genetic structure of ancient West Eurasia. a, Sampling locations and times in
six regions. Sample sizes for each population are given below each bar. E, Early; M, Middle;
L, Late; HG, hunter-gatherer; N, Neolithic; ChL, Chalcolithic; BA, Bronze Age; IA, Iron
Age. b, Principal components analysis of 991 present-day West Eurasians (grey points) with
278 projected ancient samples (excluding the Upper Palaeolithic Ust’-Ishim, Kostenki14, and
MA1). To avoid visual clutter, population labels of present-day individuals are shown in
Extended Data Fig. 1.

Figure 2 | Basal Eurasian ancestry explains the reduced Neanderthal [...]
negatively correlated to a statistic measuring Neanderthal ancestry
f4(Test, Mbuti; Altai, Denisovan).

The newly reported ancient individuals date to ~12,000-1,400 BC
and come from the southern Caucasus (Armenia), northwestern
Anatolia (Turkey), Iran, and the southern Levant (Israel and Jordan)
(Supplementary Table 1 and Fig. 1a). (One individual had a radiocarbon
date that was not in agreement with the date of its archaeological
context and was also a genetic outlier.) The samples include
Epipalaeolithic Natufian hunter-gatherers from Raqefet Cave in the
Levant (~12,000-9,800 BC); a likely Mesolithic individual (HotuIIIb)
from Hotu Cave in the Alborz mountains of Iran (probable date of
9,100-8,600 BC); pre-pottery Neolithic farmers from ‘Ain Ghazal and
Motza in the southern Levant (~8,300-6,700 BC); and early farmers
from Ganj Dareh in the Zagros mountains of western Iran (~8,200-
7,600 BC). The samples also include later Neolithic, Chalcolithic
(~4,800-3,700 BC), and Bronze Age (~3,350-1,400 BC) individuals
(Supplementary Information, section 1). We combined our data with 
previously published ancient data’~!° to form a dataset of 281 ancient 
individuals. We then further merged these data with 2,583 present-day 
people genotyped on the Affymetrix Human Origins array'*'* (238 
newly generated) (Supplementary Table 2 and Supplementary 
Information, section 2). We grouped the ancient individuals on 
the basis of archaeological culture and chronology (Fig. 1a
and Supplementary Table 1). We refined the grouping on the basis of 
patterns evident in Principal Components Analysis (PCA)!” (Fig. 1b 
and Extended Data Fig. 1), ADMIXTURE model-based clustering’® 
(Extended Data Fig. 2a), and ‘outgroup’ f3-analysis (Extended Data 
Fig. 3). We used f4-statistics to identify outlier individuals and to cluster
phylogenetically indistinguishable groups into ‘Analysis Labels’
(Supplementary Information, section 3). 

We analysed these data to address six questions. (1) Previous work 
has shown that the first European farmers harboured ancestry from 
a Basal Eurasian lineage that diverged from the ancestors of north 
Eurasian hunter-gatherers and East Asians before they separated 
from each other’*. What was the distribution of Basal Eurasian ances- 
try in the ancient Near East? (2) Were the first farmers of the Near 
East part of a single homogeneous population, or were they regionally 
differentiated? (3) Was there continuity between late pre-agricultural 
hunter-gatherers and early farming populations, or were the hunter- 
gatherers largely displaced by a single expansive population, as in early 
Neolithic Europe?® (4) What is the genetic contribution of these early 
Near Eastern farmers to later populations of the Near East? (5) What is 
the genetic contribution of the early Near Eastern farmers to later pop- 
ulations of mainland Europe, the Eurasian steppe, and to populations 
outside West Eurasia? (6) Do our data provide broader insights about 
population transformations in West Eurasia? 


Basal Eurasian and Neanderthal ancestry 

The ‘Basal Eurasians’ are a lineage hypothesized’? to have split off 
before the differentiation of all other Eurasian lineages, including 
eastern non-African populations such as the Han Chinese, and even 
the early diverged lineage represented by the genome sequence of the 
~45,000-year-old Upper Palaeolithic Siberian from Ust’-Ishim!!. To test 
for Basal Eurasian ancestry, we computed the statistic f4(Test, Han; Ust’-Ishim,
Chimp) (Supplementary Information, section 4), which measures
the excess of allele sharing of Ust’-Ishim with a variety of Test populations
compared to Han as a baseline. This statistic is significantly negative
(Z < −3.7) for all ancient Near Easterners as well as Neolithic and later
Europeans, consistent with them having ancestry from a deeply divergent
Eurasian lineage that separated from the ancestors of most Eurasians
before the separation of Han and Ust’-Ishim. We used qpAdm (ref. 7) to
estimate Basal Eurasian ancestry in each Test population. We obtained
the highest estimates in the earliest populations from both Iran (66 ± 13%
in the likely Mesolithic sample, 48 ± 6% in Neolithic samples), and the
Levant (44 ± 8% in Epipalaeolithic Natufians) (Fig. 2), showing that Basal
Eurasian ancestry was widespread across the ancient Near East.
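
As a rough illustration of the kind of statistic used here, an f4-statistic of the form f4(A, B; C, D) can be sketched as the average over SNPs of the product of two allele-frequency differences. The function name `f4` and the toy frequencies below are illustrative assumptions and are not the software or data used in the study; the sketch also omits the standard error that would be needed to report a Z-score.

```python
from statistics import mean


def f4(freq_a, freq_b, freq_c, freq_d):
    """Mean over SNPs of (a - b) * (c - d), given per-SNP allele frequencies."""
    lengths = {len(freq_a), len(freq_b), len(freq_c), len(freq_d)}
    if len(lengths) != 1:
        raise ValueError("each population needs one allele frequency per SNP")
    return mean(
        (a - b) * (c - d)
        for a, b, c, d in zip(freq_a, freq_b, freq_c, freq_d)
    )


# Hypothetical allele frequencies at four SNPs (not data from the study).
test = [0.5, 0.5, 0.0, 0.0]
han = [1.0, 1.0, 0.0, 0.0]
ust_ishim = [1.0, 1.0, 0.0, 0.0]
chimp = [0.0, 0.0, 0.0, 0.0]

# A negative value means Test shares fewer alleles with Ust'-Ishim than Han does.
value = f4(test, han, ust_ishim, chimp)
print(value)
```

In this toy case the Test population carries only half of the alleles that Han shares with Ust’-Ishim, so the statistic is negative, which is the direction of the signal described above.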

West Eurasians harbour significantly less Neanderthal ancestry 
than East Asians!®-?!, which could be explained if West Eurasians (but 
not East Asians) have partial ancestry from a source that diluted their 
Neanderthal inheritance’. Supporting this theory, we observe a nega- 
tive correlation between Basal Eurasian ancestry and the rate of shared 
alleles with Neanderthals!? (Supplementary Information, section 5 and 
Fig. 2). By extrapolation, we infer that the Basal Eurasian population 
had lower Neanderthal ancestry than non-Basal Eurasian populations 
and possibly none (95% confidence interval truncated at zero of 0-60%; 
Fig. 2; Methods). The finding of little if any Neanderthal ancestry in 
Basal Eurasians could be explained if the Neanderthal admixture into 
modern humans ~50,000-60,000 years ago"! largely occurred after the 
splitting of the Basal Eurasians from other non-Africans. 
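
The extrapolation step can be pictured with an ordinary least-squares line: regress a Neanderthal allele-sharing statistic on the estimated Basal Eurasian proportion and read off the fitted value at a proportion of 1. This is only a sketch under that assumption; the helper names and the numbers are hypothetical, and the procedure and confidence interval actually used are those described in the Methods, which this example does not reproduce.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def extrapolate(xs, ys, x_new):
    """Fitted value of the least-squares line at x_new."""
    slope, intercept = fit_line(xs, ys)
    return intercept + slope * x_new


# Hypothetical values: Basal Eurasian proportion against a Neanderthal-sharing statistic.
basal_eurasian = [0.0, 0.2, 0.4, 0.6]
neanderthal_stat = [0.030, 0.026, 0.022, 0.018]

# Fitted statistic for a hypothetical population with 100% Basal Eurasian ancestry.
at_full_basal = extrapolate(basal_eurasian, neanderthal_stat, 1.0)
print(at_full_basal)
```

A negative slope corresponds to the negative correlation reported above; how close the extrapolated value lies to zero is what bears on whether Basal Eurasians had any Neanderthal ancestry.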

It is striking that the highest estimates of Basal Eurasian ancestry are 
from the Near East, given the hypothesis that it was there that most 
admixture between Neanderthals and modern humans occurred'®”’, 
This could be explained if Basal Eurasians thoroughly admixed into 
the Near East before the time of the samples we analysed but after the 
Neanderthal admixture. Alternatively, the ancestors of Basal Eurasians 
may have always lived in the Near East, but the lineage of which they 
were a part did not participate in the Neanderthal admixture. 

A population without Neanderthal admixture, basal to other 
Eurasians, may have plausibly lived in Africa. Craniometric analyses 
have suggested an affinity between the Natufians and populations of 
north or sub-Saharan Africa”*4, a result that finds some support from 
Y chromosome analysis showing that the Natufians and successor 
Levantine Neolithic populations carried haplogroup E, likely to be of 
ultimately African origin, which has not been detected in other ancient 
males from West Eurasia”* (Supplementary Information, section 6). 
However, no affinity of Natufians to sub-Saharan Africans is evident 
in our genome-wide analysis, as present-day sub-Saharan Africans do 
not share more alleles with Natufians than with other ancient Eurasians 
(Extended Data Table 1). (We could not test for a link to present-day 
North Africans, who owe most of their ancestry to back-migration from 
Eurasia”*°.) The idea of Natufians as a vector for the movement of 
Basal Eurasian ancestry into the Near East is also not supported by our 
data, as the Basal Eurasian ancestry in the Natufians (44 ± 8%) is consistent
with stemming from the same population as that in the Neolithic
and Mesolithic populations of Iran, and is not greater than in those pop- 
ulations (Supplementary Information, section 4). Further insight into 
the origins and legacy of the Natufians could come from comparison to 
Natufians from additional sites, and to ancient DNA from North Africa. 


Extreme differentiation in the ancient Near East 
PCA on present-day West Eurasian populations (Methods and 
Extended Data Fig. 1), on which we projected the ancient individuals 
(Fig. 1b), replicates previous findings of a Europe–Near East contrast
along the horizontal principal component 1 (PC1) and parallel
clines (PC2) in both Europe and the Near East”*° (Extended Data 
Fig. 1). Ancient samples from the Levant clustered at one end of the 
Near Eastern cline, and ancient samples from Iran at the other. The 
two Caucasus hunter-gatherers (CHG)? are less extreme along PC1 
than the Mesolithic and Neolithic individuals from Iran, while indi- 
viduals from Chalcolithic Anatolia, Iran, Armenia, and Bronze Age 
Armenia occupy intermediate positions. Qualitatively, the PCA has the 
appearance of a quadrangle whose four corners are some of the oldest 
samples: bottom-left, Western hunter-gatherers (WHG); top-left, 
Eastern hunter-gatherers (EHG); bottom-right, Neolithic Levant and 
Natufians; top-right, Neolithic Iran. This suggests that diverse ancient 
West Eurasians can be modelled as mixtures of as few as four streams 
of ancestry related to these populations, which we confirmed using 
qpWave (ref. 7) (Supplementary Information, section 7).

We computed squared allele frequency differentiation between all 
pairs of ancient West Eurasians”’ (Methods; Fig. 3 and Extended Data 
Figs 2b and 4), and found that the populations at the four corners of 
the quadrangle had differentiation of FST = 0.08-0.15, comparable to
the value of 0.09-0.13 seen between present-day West Eurasians and
East Asians (Han) (Supplementary Table 3). By contrast, by the Bronze
Age, genetic differentiation between pairs of West Eurasian populations
had reached its present-day low levels (Fig. 3): today, FST is <0.025 for
95% of the pairs of West Eurasian populations and <0.046 for all pairs
(Supplementary Table 3). These results point to a demographic pro- 
cess that established high differentiation across West Eurasia and then 
reduced this differentiation over time. 
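
To make the FST values above concrete, the sketch below computes FST between two populations from allele frequencies. The choice of Hudson's estimator (averaged as a ratio of sums over SNPs) is an assumption made for illustration, since this section does not state which estimator was used; the function name and inputs are likewise hypothetical, and a single sample size per population is assumed for all SNPs.

```python
def hudson_fst(p1, p2, n1, n2):
    """Hudson's FST as a ratio of sums over SNPs.

    p1, p2: per-SNP allele frequencies in the two populations.
    n1, n2: number of sampled chromosomes in each population (at least 2).
    """
    if len(p1) != len(p2):
        raise ValueError("frequency lists must cover the same SNPs")
    if n1 < 2 or n2 < 2:
        raise ValueError("need at least two chromosomes per population")
    numerator = 0.0
    denominator = 0.0
    for a, b in zip(p1, p2):
        # Squared frequency difference, corrected for finite sample size.
        numerator += (a - b) ** 2 - a * (1 - a) / (n1 - 1) - b * (1 - b) / (n2 - 1)
        # Probability that one allele drawn from each population differs.
        denominator += a * (1 - b) + b * (1 - a)
    if denominator == 0:
        raise ValueError("no SNP is polymorphic across the two populations")
    return numerator / denominator


# Two populations fixed for different alleles at every SNP are maximally differentiated.
print(hudson_fst([1.0, 0.0], [0.0, 1.0], 20, 20))
```

With one randomly sampled allele per individual, as in the data described here, the number of chromosomes per population would equal the number of individuals with data at a SNP.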


Continuity between hunter-gatherers and early farmers 
Our data document continuity across the transition between hunter- 
gatherers and farmers, separately in the southern Levant and in the 
southern Caucasus-Iran highlands. The qualitative evidence for this 
is that PCA, ADMIXTURE, and outgroup f3 analysis cluster Levantine
hunter-gatherers (Natufians) with Levantine farmers, and Iranian and
CHG with Iranian farmers (Fig. 1b and Extended Data Figs 1, 3). We
confirm this in the Levant by showing that its early farmers share significantly
more alleles with Natufians than with the early farmers of Iran:
the statistic f4(Levant_N, Chimp; Natufian, Iran_N) is significantly
positive (Z = 13.6). The early farmers of the Caucasus-Iran highlands
similarly share significantly more alleles with the hunter-gatherers of
this region than with the early farmers from the Levant: the statistic
f4(Iran_N, Chimp; Caucasus or Iran highland hunter-gatherers,
Levant_N) is significantly positive (Z > 6).


Admixture in the ancient Near East 

Almost all ancient and present-day West Eurasians have evidence of 
significant admixture between two or more ancestral populations, as 
documented by statistics of the form f3(Test; Reference1, Reference2)
which, if negative, show that a test population's allele frequencies tend 
to be an intermediate between two reference populations'® (Extended 
Data Table 2). To better understand the admixture history beyond 
these patterns, we used qpAdm (ref. 7), which can evaluate whether a 
particular test population is consistent with being derived from a set 
of proposed source populations, and if so, infer mixture proportions 
(Methods). We used this approach to carry out a systematic survey of 
ancient West Eurasian populations to explore their possible sources of 
admixture (Fig. 4 and Supplementary Information, section 7). 
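
The logic of the f3 admixture test can be sketched in a few lines: if the Test population's allele frequencies lie between those of the two references, the products (t − r1)(t − r2) are negative on average. The function below is a simplified illustration with hypothetical names and toy frequencies; it omits the finite-sample correction and the standard error that a real analysis would need, and it is not a substitute for qpAdm, which this sketch does not attempt to reproduce.

```python
def f3(test, ref1, ref2):
    """Mean over SNPs of (t - r1) * (t - r2), from per-SNP allele frequencies."""
    if not test or not (len(test) == len(ref1) == len(ref2)):
        raise ValueError("need the same non-empty set of SNPs for all populations")
    products = [(t - a) * (t - b) for t, a, b in zip(test, ref1, ref2)]
    return sum(products) / len(products)


# Hypothetical reference frequencies and an equal mixture of the two.
ref_a = [0.0, 1.0, 0.2, 0.8]
ref_b = [1.0, 0.0, 0.8, 0.2]
admixed = [0.5 * a + 0.5 * b for a, b in zip(ref_a, ref_b)]

# Negative for the mixture, because its frequencies are intermediate.
print(f3(admixed, ref_a, ref_b))
```

A population that is not intermediate between the references, for example one that has drifted away from both, would typically give a non-negative value, which is why only a negative f3 is taken as evidence of admixture.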

Among first farmers, those of the Levant trace approximately two- 
thirds of their ancestry to people related to Natufian hunter-gatherers 
and about one-third to people related to Anatolian farmers 
(Supplementary Information, section 7). Western Iranian first farmers 
cluster with the likely Mesolithic HotuIIIb individual and more
remotely with hunter-gatherers from the southern Caucasus (Fig. 
1b), and share alleles at an equal rate with Anatolian and Levantine 
early farmers (Supplementary Information, section 7), highlighting 
the long-term isolation of western Iran. 

During subsequent millennia, the early farmer populations of the 
Near East expanded in all directions and mixed, as we can model 
populations of the Chalcolithic and subsequent Bronze Age only as 
having ancestry from two or more sources. The Chalcolithic people of 
western Iran can be modelled as a mixture of the Neolithic people of 
western Iran, the Levant and CHG, consistent with their position in the 
PCA (Fig. 1b). Admixture from populations related to the Chalcolithic 
people of western Iran had a wide impact, consistent with contrib- 
uting around 44% of the ancestry of Levantine Bronze Age popula- 
tions in the south and about 33% of the ancestry of the Chalcolithic 
North-West Anatolians in the west. Our analysis shows that the ancient 
populations of Chalcolithic Iran, Chalcolithic Armenia, Bronze Age 
Armenia and Chalcolithic Anatolia were all composed of the same 
ancestral components, albeit in slightly different proportions (Fig. 4b 
and Supplementary Information, section 7). 


Admixture into Europe, East Africa and South Asia 
Admixture did not only occur within the Near East but also extended 
towards Europe. To the north, a population related to people of 
Chalcolithic Iran contributed about 43% of the ancestry of early Bronze 
Age populations of the steppe. The spread of Near Eastern ancestry 
into the Eurasian steppe was previously inferred’ without access to 
ancient samples, with a population related to present-day Armenians as 
a suggested source”®. To the west, the early farmers of mainland Europe 
were descended from a population related to Neolithic North-Western 
Anatolians®. This is consistent with an Anatolian origin of farming in 
Europe, but does not reject other sources, as the spatial distribution 
of the Anatolian/European-like farmer populations is unknown. We 
can rule out the hypothesis that European farmers stem directly from 
a population related to the ancient farmers of the southern Levant”®”?, 
however, because European farmers share more alleles with Anatolian 
Neolithic farmers than with Levantine farmers, as attested by the pos- 
itive statistic f4(Europe_EN, Chimp; Anatolia_N, Levant_N) (Z=15). 
Migration from the Near East also occurred towards the southwest into 
East African populations, which experienced West Eurasian admixture 
around 1,000 BC*”?!. Previously, the West Eurasian population known
to be the best proxy for this ancestry was present-day Sardinians*!, who
resemble Neolithic Europeans genetically'***. However, our analysis
shows that East African ancestry is significantly better modelled by 
Levantine early farmers than by Anatolian or early European farmers, 
implying that the spread of this ancestry to East Africa was not from the 
same group that spread Near Eastern ancestry into Europe (Extended 
Data Fig. 5 and Supplementary Information, section 8). 

In South Asia, our dataset provides insight into the sources of 
Ancestral North Indians (ANI), a West Eurasian-related population 
that no longer exists in unmixed form but contributes a variable amount 
of the ancestry of South Asians**** (Supplementary Information, 
section 9 and Extended Data Fig. 5). We show that it is impossible to 
model the ANI as being derived from any single ancient population in 
our dataset. However, it can be modelled as a mix of ancestries related 
to both early farmers of western Iran and people of the Bronze Age 
Eurasian steppe; all sampled South Asian groups are inferred to have 
significant amounts of both ancestral types. The demographic impact of 
steppe-related populations on South Asia was substantial, as the Mala, 
a south Indian population with minimal ANI along the ‘Indian Cline’ 
of such ancestry***4, is inferred to have around 18% steppe-related 
ancestry, while the Kalash of Pakistan are inferred to have about 50%, 
similar to present-day northern Europeans’. 


Population transformations in West Eurasia and beyond 
We were concerned that our conclusions might be biased by the par- 
ticular populations we happened to sample, and that we would have 
obtained qualitatively different conclusions without data from some key 
populations. We tested our conclusions by plotting the inferred position 
of admixed populations in PCA against a weighted combination of 
their inferred source populations and obtained qualitatively consistent 
results (Extended Data Fig. 6). 

To further assess the robustness of our inferences, we developed a 
method to infer the existence and genetic affinities of ancient pop- 
ulations from unobserved ‘ghost’ populations (Supplementary 
Information, section 10 and Extended Data Fig. 7). This method 
takes advantage of the insight that if an unsampled ghost population 
admixes with differentiated ‘substratum’ populations, it is possible
to extrapolate its identity by intersecting clines of populations with 
variable proportions of ghost and substratum ancestry. Applying this 
approach while withholding major populations, we validated some of 
our key inferences, successfully inferring mixture proportions consist- 
ent with those obtained when the populations were included in the 
analysis. Application of this method highlights the impact of Ancient 
North Eurasian (ANE) ancestry related to the ~22,000 BC Mal’ta 1 and
~15,000 BC Afontova Gora 2 (ref. 15) on populations living in Europe,
the Americas and Eastern Eurasia. Eastern Eurasians can be modelled 
as arrayed along a cline with different proportions of ANE ancestry 
(Supplementary Information, section 11 and Extended Data Fig. 8), 
ranging from about 40% ANE in Native Americans, matching previ- 
ous findings’*!°, to no less than around 5-10% ANE in diverse East 
Asian groups including Han Chinese (Extended Data Figs 5, 7f). We 
also document a cline of ANE ancestry across the East-West extent of 
Eurasia. Eastern hunter-gatherers (EHG) derive about three-quarters 
of their ancestry from the ANE (Supplementary Information, section 
11); Scandinavian hunter-gatherers”*!? (SHG) are a mix of EHG 
and WHG; and WHG are a mix of EHG and populations related to 
the Upper Palaeolithic Bichon from Switzerland (Supplementary 
Information, section 7). Northwest Anatolians—with ancestry from 
a population related to European hunter-gatherers (Supplementary 
Information, section 7)—are better modelled if this ancestry is taken as 
more extreme than Bichon (Supplementary Information, section 10). 
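
The idea of intersecting clines can be pictured geometrically. In the sketch below each cline is reduced to a straight line through two points in some two-dimensional space of genetic statistics, and the ghost population is placed where the two lines cross. This is a deliberately simplified picture with hypothetical coordinates and function names; the method actually used is the one described in Supplementary Information, section 10.

```python
def line_through(p, q):
    """Return (a, b, c) with a*x + b*y = c for the line through points p and q."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    return a, b, a * x1 + b * y1


def intersect(cline_1, cline_2):
    """Intersection of two straight clines, each given by two (x, y) points."""
    a1, b1, c1 = line_through(*cline_1)
    a2, b2, c2 = line_through(*cline_2)
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("clines are parallel; no unique intersection")
    # Cramer's rule for the 2x2 linear system.
    return (c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det


# Two hypothetical clines: each pair of points is two populations with
# different proportions of ghost ancestry on a different substratum.
cline_a = ((0.0, 0.0), (1.0, 1.0))
cline_b = ((0.0, 4.0), (1.0, 3.0))

print(intersect(cline_a, cline_b))
```

Both clines point towards the same unsampled source, so extending them beyond the sampled populations locates it, here at a position more extreme than any observed point.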

The population structure of the ancient Near East was not inde- 
pendent of that of Europe (Supplementary Information, section 4), 
as evidenced by the highly significant (Z = −8.9) statistic f4(Iran_N,
Natufian; WHG, EHG), which suggests gene flow in ‘northeastern’
(Neolithic Iran/EHG) and ‘southwestern’ (Levant/WHG) interaction
spheres (Fig. 4d). This interdependence of the ancestry of Europe and
the Near East may have been mediated by unsampled geographically
intermediate populations* that contributed ancestry to both regions.


Conclusions 

By analysing genome-wide ancient DNA data from ancient individuals 
from the Levant, Anatolia, the southern Caucasus and Iran, we have 
provided a first glimpse into the demographic structure of the human 
populations that transitioned to farming. We reject the hypothesis that 
the spread of agriculture in the Near East was achieved by the dis- 
persal of a single farming population displacing the hunter-gatherers 
they encountered. Instead, the spread of ideas and farming technology 
moved faster than the spread of people, as we can determine from the 
fact that the population structure of the Near East was maintained 
throughout the transition to agriculture. A priority for future ancient 
DNA studies should be to obtain data from older periods, which would 
reveal the deeper origins of the population structure in the Near East. 
It will also be important to obtain data from the ancient civilizations 
of the Near East to bridge the gap between the region’s prehistoric 
inhabitants and those of the present. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS


No statistical methods were used to predetermine sample size. The experiments 
were not randomized and the investigators were not blinded to allocation during 
experiments and outcome assessment. 

Ancient DNA data. In a dedicated ancient DNA laboratory at University College 
Dublin, we prepared powder from 132 ancient Near Eastern samples, either by 
dissecting the inner ear region of the petrous bone using a sandblaster (Renfert), 
or by drilling using a Dremel tool and single-use drill bits and selecting the best 
preserved bone fragments based on anatomical criteria. These fragments were then 
powdered using a mixer mill (Retsch Mixer Mill 400)*. 

We performed all subsequent processing steps in a dedicated ancient DNA 
laboratory at Harvard Medical School, where we extracted DNA from the powder 
(usually 75 mg, range 14-81 mg) using an optimized ancient DNA extraction 
protocol*®, but replaced the assembly of Qiagen MinElute columns and extension 
reservoirs from Zymo Research with a High Pure Extender Assembly from the 
High Pure Viral Nucleic Acid Large Volume Kit (Roche Applied Science). We built 
a total of 170 barcoded double-stranded Illumina sequencing libraries for these 
samples*’, of which we treated 167 with uracil-DNA glycosylase (UDG) to remove 
the characteristic C-to-T errors of ancient DNA**. The UDG treatment strategy is 
(by design) inefficient at removing terminal uracils, allowing the mismatch rate
to the human genome at the terminal nucleotide to be used for authentication*”. 
We updated this library preparation protocol in two ways compared to the original 
publication: first, we used 16 U Bst2.0 Polymerase, Large Fragment (NEB) and
1× Isothermal amplification buffer (NEB) in a final volume of 25 µl fill-in reaction,
and second, we used the entire inactivated 25 µl fill-in reaction in a total volume
of 100 µl PCR mix with 1 µM of each primer*’. We included extraction negative
controls (where no sample powder was used) and library negative controls (where 
extract was supplemented by water) in every batch of samples processed and carried 
them through the entire wet laboratory processing to test for reagent contamination. 

We screened the libraries by hybridizing them in solution to a set of oligonu- 
cleotide probes tiling the mitochondrial genome”, using the protocol described 
previously’. We sequenced the enriched libraries using an Illumina NextSeq 500 
instrument using 2 × 76 bp reads, trimmed identifying sequences (seven base pair
molecular barcodes at either end) and any trailing adapters, merged read pairs that 
overlapped by at least 15 base pairs, and mapped the merged sequences to the RSRS 
mitochondrial DNA reference genome“, using the Burrows Wheeler Aligner” 
(bwa) and the command samse (v0.6.1). 
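
The read-pair merging step can be sketched as follows: the second read is reverse-complemented and the longest overlap of at least 15 bases between the end of the first read and the start of the reverse-complemented mate is used to join them. The function names and the `max_mismatches` parameter are illustrative assumptions, and the sketch ignores adapter read-through and base qualities, which a production merger would handle.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written with A, C, G, T, N."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(read1, read2, min_overlap=15, max_mismatches=0):
    """Merge a read pair into one sequence, or return None if they do not overlap.

    read2 is given as sequenced, that is, as the reverse complement of the
    far end of the fragment.
    """
    mate = reverse_complement(read2)
    longest = min(len(read1), len(mate))
    for overlap in range(longest, min_overlap - 1, -1):
        tail, head = read1[-overlap:], mate[:overlap]
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches <= max_mismatches:
            return read1 + mate[overlap:]
    return None


# A hypothetical 40-base fragment sequenced from both ends with 30-base reads.
fragment = "ACGTTGCAAGGCTTAACCGGATATCGCGTAGCTAGGACTT"
forward = fragment[:30]
reverse = reverse_complement(fragment[10:])

print(merge_pair(forward, reverse) == fragment)
```

Requiring a minimum overlap means that only fragments shorter than the combined read length are retained, which suits the short molecules typical of ancient DNA.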

We enriched promising libraries for a targeted set of ~1.2 million SNPs® 
as in ref. 5, and adjusted the blocking oligonucleotide and primers to be appropriate 
for our libraries. The specific probe sequences are given in supplementary data 2 of 
ref. 7 and supplementary data 1 of ref. 6. We sequenced the libraries on an Illumina
NextSeq 500 using 2 × 76 bp reads. We trimmed identifying sequences (molecular
barcodes) and any trailing adapters, merged pairs that overlapped by at least 15 
base pairs (allowing up to one mismatch), and mapped the merged sequences to 
hg19 using the single-ended aligner samse in bwa (v0.6.1). We removed duplicated 
sequences by identifying sets of sequences with the same orientation and start and 
end positions after alignment to hg19; we picked the highest quality sequence to 
represent each set. For each sample, we represented each SNP position by a ran- 
domly chosen sequence, restricting to sequences with a minimum mapping quality 
(MAPQ > 10), sites with a minimum sequencing quality (>20), and removing two 
bases at the ends of reads. We sequenced the enriched products up to the point that 
we estimated that generating a hundred new sequences was expected to add data 
on less than about one new SNP*. 
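
The duplicate-removal step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the `Read` record and its field names are hypothetical, and a single numeric `quality` score per sequence is assumed.

```python
from collections import namedtuple

# Hypothetical minimal read record; field names are illustrative only.
Read = namedtuple("Read", ["name", "strand", "start", "end", "quality"])

def remove_duplicates(reads):
    """Keep one read per (strand, start, end) set, choosing the highest quality."""
    best = {}
    for read in reads:
        key = (read.strand, read.start, read.end)
        if key not in best or read.quality > best[key].quality:
            best[key] = read
    return sorted(best.values(), key=lambda r: (r.start, r.end, r.strand))

reads = [
    Read("a", "+", 100, 160, 30.0),
    Read("b", "+", 100, 160, 35.5),  # duplicate of "a" with higher quality
    Read("c", "-", 100, 160, 20.0),  # same coordinates, opposite orientation
    Read("d", "+", 200, 250, 28.0),
]
kept = remove_duplicates(reads)
print([r.name for r in kept])
```

Reads "a" and "b" share orientation and coordinates, so only the higher-quality one is retained; read "c" is kept because its orientation differs.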

Testing for contamination and quality control. For each ancient DNA library, 
we evaluated authenticity in several ways. First, we estimated the rate of matching 
to the consensus sequence for mitochondrial genomes sequenced to a coverage of 
at least tenfold from the initial screening data. Of the 76 libraries that contributed 
to our dataset (coming from 45 samples), 70 had an estimated rate of sequencing 
matching to the consensus of >95% according to contamMix° (the remaining 
libraries had estimated match rates of 75-92%, but gave no sign of being outliers 
in principal component analysis or X-chromosome contamination analysis so we 
retained them for analysis) (Supplementary Table 1). We quantified the rate of 
C-to-T substitution in the final nucleotide of the sequences analysed, relative to 
the human reference genome sequence, and found that all the libraries analysed 
had rates of at least 3% (ref. 37), consistent with genuine ancient DNA. For the 
nuclear data from males, we used the ANGSD software*? to obtain a conservative 
X-chromosome estimate of contamination. We determined that all libraries that 
passed our quality control and for which we had sufficient X-chromosome data 
to make an assessment, had contamination rates of 0-1.5%. Finally, we merged 
data for samples for which we had multiple libraries to produce an analysis dataset. 
Affymetrix Human Origins genotyping data. We genotyped 238 present-day 
individuals from 17 diverse West Eurasian populations on the Affymetrix Human 
Origins array!®, and applied quality control analyses as previously described! 
(Supplementary Table 2). We merged the newly generated data with data from 
2,345 individuals previously genotyped on the same array’’. All individuals that 
were genotyped provided individual informed consent consistent with studies of 
population history, following protocols approved by the ethical review committees 
of the institutions of the researchers who collected the samples. The collection 
and analysis of genome-wide data on anonymized samples at Harvard Medical 
School for the purpose of studying population history was approved by the Harvard 
Human Research Protection Program, protocol 11681, re-reviewed on 12 July 2016. 
Anonymized aliquots of DNA from all individuals were sent to the core facility of 
the Center for Applied Genomics at the Children’s Hospital of Philadelphia for gen- 
otyping and data processing. For 127 of the individuals with newly reported data, 
the informed consent was consistent with public distribution of data, and the data 
can be downloaded at http://genetics.med.harvard.edu/reich/Reich_Lab/Datasets.html. 
To access data for the remaining 111 newly reported samples, researchers 
should send a signed letter to D.R. containing the following text: “(a) I will not 
distribute the data outside my collaboration; (b) I will not post the data publicly; 
(c) I will make no attempt to connect the genetic data to personal identifiers for 
the samples; (d) I will use the data only for studies of population history; (e) I will 
not use the data for any selection studies; (f) I will not use the data for medical 
or disease-related analyses; (g) I will not use the data for commercial purposes.” 
Supplementary Table 2 specifies which samples are consistent with which type of 
data distribution. 

Datasets. We carried out population genetic analysis on two datasets: (i) HO 
includes 2,583 present-day humans genotyped on the Human Origins array'*'° 
including 238 newly reported (Supplementary Table 2; Supplementary 
Information, section 2), and 281 ancient individuals on a total of 592,146 autoso- 
mal SNPs. (ii) HOIII includes the 281 ancient individuals on a total of 1,055,186 
autosomal SNPs, including those present in both the Human Origins and Illumina 
genotyping platforms, but excluding SNPs on the sex chromosomes or additional 
SNPs of the 1,240k capture array that were included because of their potential 
functional importance®. We used HO for analyses that involve both ancient and 
present-day individuals, and HOIII for analysis on ancient individuals alone. We 
also used 235 individuals from Pagani et al.*° genotyped at 418,700 autosomal 
SNPs to study admixture in East Africans (Supplementary Information, section 
8). Ancient individuals are represented in ‘pseudo-haploid’ form by randomly 
choosing one allele for each position of the array. 
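
A minimal sketch of this pseudo-haploid representation is given below. The tuple layout is hypothetical, and treating the quality thresholds as inclusive lower bounds is an assumption; the trimming of two bases at read ends is not modelled.

```python
import random

def pseudo_haploid_call(observations, min_mapq=10, min_baseq=20, rng=None):
    """Pick one allele at random from the reads covering a SNP.

    observations: list of (allele, mapq, baseq) tuples, one per read.
    Returns the chosen allele, or None if no read passes the filters.
    """
    rng = rng or random.Random()
    # Assumption: thresholds are inclusive lower bounds.
    passing = [allele for allele, mapq, baseq in observations
               if mapq >= min_mapq and baseq >= min_baseq]
    return rng.choice(passing) if passing else None

site = [("A", 37, 30), ("G", 37, 30), ("T", 37, 5)]  # "T" fails base quality
print(pseudo_haploid_call(site, rng=random.Random(0)))
```

Because a single allele is drawn per site, the call never appears heterozygous, which is why such data are analysed as haploid.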

Principal components analysis. We carried out principal components analysis 
in the smartpca program of EIGENSOFT”’, using default parameters and the 
lsqproject: YES!’ and numoutlieriter: 0 options. We carried out PCA on the HO 
dataset for 991 present-day West Eurasians (Extended Data Fig. 1), and projected 
the 278 ancient individuals (Fig. 1b). 
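
The idea behind projecting samples with missing data onto axes computed from present-day individuals can be sketched as a least-squares fit restricted to the observed SNPs. This is an illustration of the general technique only and is not taken from the smartpca source; the function name and the toy data are assumptions.

```python
import numpy as np

def lsq_project(sample, loadings, mean):
    """Project one sample onto principal axes using only non-missing SNPs.

    sample:   (n_snps,) genotype vector with np.nan for missing sites
    loadings: (n_snps, n_pcs) SNP loadings from the reference PCA
    mean:     (n_snps,) per-SNP mean used to centre the reference data
    """
    observed = ~np.isnan(sample)
    coords, *_ = np.linalg.lstsq(loadings[observed],
                                 sample[observed] - mean[observed],
                                 rcond=None)
    return coords

rng = np.random.default_rng(0)
reference = rng.integers(0, 3, size=(50, 200)).astype(float)  # toy genotypes
mean = reference.mean(axis=0)
_, _, vt = np.linalg.svd(reference - mean, full_matrices=False)
loadings = vt[:2].T                                            # top two axes

true_coords = np.array([3.0, -1.5])
ancient = mean + loadings @ true_coords      # sample lying exactly in the PC plane
ancient[rng.random(200) < 0.6] = np.nan      # mask roughly 60% of sites
print(lsq_project(ancient, loadings, mean))
```

In this noise-free toy case the fit recovers the coordinates used to build the sample despite the missing sites.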

ADMIXTURE analysis. We carried out ADMIXTURE analysis!* of the HO 
dataset after pruning for linkage disequilibrium in PLINK**5 with parameters 
indep-pairwise 200 25 0.4, which retained 296,309 SNPs. We performed analysis 
in 20 replicates with different random seeds, and retained the highest likelihood 
replicate for each value of K. We show the K= 11 results for the 281 ancient samples 
in Extended Data Fig. 2a (this is the lowest K for which components maximized in 
European hunter-gatherers, ancient Levant, and ancient Iran appear). 
f-statistics. We carried out analysis of f3-statistics, f4-ratio, and f4-statistics 
using the ADMIXTOOLS" programs qp3Pop, qpF4ratio with default parameters, 
and qpDstat with f4mode: YES, and computed standard errors with a block 
jack-knife**. For computing f3-statistics with an ancient population as a target, 
we set the inbreed: YES parameter. We computed f-statistics on the HOIII dataset 
when no present-day humans were involved and on the HO dataset when they 
were. We computed the statistic f4(Test, Mbuti; Altai, Denisovan) in Fig. 2 on 
the HOIII dataset after merging with whole genome data on 3 Mbuti individuals 
from Panel C of the Simons Genome Diversity Project*”. We computed the den- 
drogram of Extended Data Fig. 3 showing hierarchical clustering of populations 
with outgroup f3-statistics using the open source heatmap.2 function of the gplots 
package in R. 
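
For orientation, an f4-statistic and its block jack-knife standard error can be sketched as below. This is a simplified illustration, not the ADMIXTOOLS implementation: it assumes population allele frequencies are already available per SNP, uses equal-weight contiguous blocks of SNPs rather than blocks defined by genomic distance, and the block count is illustrative.

```python
import numpy as np

def f4(a, b, c, d):
    """f4(A, B; C, D): mean over SNPs of (a - b) * (c - d) for allele frequencies."""
    return np.mean((a - b) * (c - d))

def block_jackknife(per_snp, n_blocks):
    """Delete-one-block jack-knife of a mean over contiguous, equal-weight blocks."""
    blocks = np.array_split(per_snp, n_blocks)
    total, n = per_snp.sum(), per_snp.size
    # Estimate recomputed with each block left out in turn.
    loo = np.array([(total - b.sum()) / (n - b.size) for b in blocks])
    estimate = loo.mean()
    se = np.sqrt((n_blocks - 1) / n_blocks * np.sum((loo - estimate) ** 2))
    return estimate, se

rng = np.random.default_rng(1)
a, b, c, d = rng.uniform(0.05, 0.95, size=(4, 1000))  # toy allele frequencies
per_snp = (a - b) * (c - d)
estimate, se = block_jackknife(per_snp, n_blocks=100)
print(f"f4 = {estimate:.5f}, SE = {se:.5f}, Z = {estimate / se:.2f}")
```

The Z-score is the estimate divided by its standard error, so a negative statistic always carries a negative Z-score.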

Negative correlation of Basal Eurasian ancestry with Neanderthal ancestry. We 
used the lm function of R to fit a linear regression of the rate of allele sharing of a 
Test population with the Altai Neanderthal as measured by f4(Test, Mbuti; Altai, 
Denisovan) as the dependent variable, and the proportion of Basal Eurasian ances- 
try (Supplementary Information, section 4) as the predictor variable. Extrapolating 
from the fitted line, we obtain the value of the statistic expected if Test is a popu- 
lation of 0% or 100% Basal Eurasian ancestry. We then compute the ratio of the 
Neanderthal ancestry estimate in Basal Eurasians relative to non-Basal Eurasians 
as f4(100% Basal Eurasian, Mbuti; Altai, Denisovan)/f4(0% Basal Eurasian, Mbuti; 
Altai, Denisovan). We use a block jack-knife*®, dropping one of 100 contiguous 
blocks of the genome at a time, to estimate the value and standard error of this 
quantity (9 ± 26%). We compute a 95% confidence interval based on the point 
estimate ± 1.96 times the standard error: −42 to 60%. We truncated to 0-60% 
on the assumption that Basal Eurasians had no less Neanderthal admixture than 
Mbuti from sub-Saharan Africa. 
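
The interval arithmetic in this paragraph can be checked with a short calculation. The helper name is hypothetical; the inputs (9 and 26, in per cent) and the lower bound of 0 are the values stated above.

```python
def confidence_interval(estimate, se, z=1.96, lower_bound=None):
    """Normal-approximation interval, optionally truncated at a lower bound."""
    low, high = estimate - z * se, estimate + z * se
    if lower_bound is not None:
        low = max(low, lower_bound)
    return low, high

low, high = confidence_interval(9.0, 26.0)
print(round(low), round(high))                      # -42 60

low_t, high_t = confidence_interval(9.0, 26.0, lower_bound=0.0)
print(round(low_t), round(high_t))                  # 0 60
```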

Estimation of FST coefficients. We estimated FST in smartpca’’ with default 
parameters, inbreed: YES, and fstonly: YES. 

Admixture graph modelling. We carried out Admixture Graph modelling with 
the qpGraph software!® using Mbuti as an outgroup unless otherwise specified. 
Testing for the number of streams of ancestry. We used the qpWave*>* software, 
described in Supplementary Information, section 10 of ref. 7, to test whether a set of 
‘Left’ populations is consistent with being related via as few as N streams of ancestry 
to a set of ‘Right’ populations by studying statistics of the form $X(u, v) = f_4(u_0, u; 
v_0, v)$, where $u_0$, $v_0$ are basis populations chosen from the ‘Left’ and ‘Right’ sets 
and u, v are other populations from these sets. We use a Hotelling’s T² test** to 
evaluate whether the matrix of size (L−1) × (R−1), where L, R are the sizes of the 
‘Left’ and ‘Right’ sets, has rank m. If this is the case, we can conclude that the ‘Left’ 
set is related via at least N = m + 1 streams of ancestry differently to the ‘Right’ set. 
We use the parameter allsnps: YES, which computes each f4-statistic based on the 
full set of SNPs with coverage among the four populations used in the statistic 
(without regard to whether the SNPs are covered in the other populations in the 
‘Left’ and ‘Right’ sets). 

Inferring mixture proportions without an explicit phylogeny. We used the 
qpAdm methodology described in Supplementary Information, section 10 of ref. 7 
to estimate the proportions of ancestry in a Test population deriving from a mixture 
of N ‘reference’ populations by exploiting (but not explicitly modelling) shared 
genetic drift with a set of ‘Outgroup’ populations (Supplementary Information, 
section 7). We set the details: YES parameter, which reports a normally distributed 
Z-score estimated with a block jack-knife for the difference between the statistics 
$f_4(u_0, \mathrm{Test}; v_0, v)$ and $f_4(u_0, \mathrm{Estimated\ Test}; v_0, v)$, where Estimated Test is 
$\sum_{i=1}^{N} \alpha_i f_4(u_0, \mathrm{Ref}_i; v_0, v)$, the average of these f4-statistics weighted by the mixture 
proportions $\alpha_i$ from the N reference populations. We use the allsnps: YES 
parameter. 
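
The relationship exploited here, that the f4-statistics of the Test population are a weighted average of those of the reference populations, can be illustrated with a small constrained least-squares fit. This is a sketch under simplifying assumptions, not qpAdm itself: it uses ordinary (unweighted) least squares on toy values, ignores the covariance between statistics, and the function name is hypothetical.

```python
import numpy as np

def mixture_proportions(target, refs):
    """Least-squares weights, summing to one, such that target ≈ refs.T @ alpha.

    target: (n_stats,) values of f4(u0, Test; v0, v) over outgroup pairs
    refs:   (n_refs, n_stats) values of f4(u0, Ref_i; v0, v) per reference
    """
    last = refs[-1]
    # Substituting alpha_N = 1 - sum(other alphas) removes the constraint.
    design = (refs[:-1] - last).T
    beta, *_ = np.linalg.lstsq(design, target - last, rcond=None)
    return np.append(beta, 1.0 - beta.sum())

rng = np.random.default_rng(2)
refs = rng.normal(scale=0.01, size=(2, 6))   # toy f4 vectors for two references
target = 0.3 * refs[0] + 0.7 * refs[1]       # Test built as a 30/70 mixture
alpha = mixture_proportions(target, refs)
print(alpha)
```

With noise-free toy input the fit returns the proportions used to construct the target; with real data the residuals between observed and fitted statistics are what the reported Z-scores assess.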

Modelling admixture from ghost populations. We model admixture from a 
‘ghost’ (unobserved) population X in the specific case that X has part of its ancestry 
from two unobserved ancestral populations p and q. Any population X composed 
of the same populations p and q resides on a line defined by two observed reference 
populations $r_1$ and $r_2$ composed of the same elements p and q according to a 
parametric equation $x = r_1 + \lambda(r_2 - r_1)$ with real-valued parameter λ. We define 
and solve the optimization problem of fitting λ and obtain mixture proportions 
(Supplementary Information, section 10). 

Code availability. Code implementing the newly developed method for modelling 
admixture from ghost populations is available on request from I.L. 
Extended Data Figure 1 | Principal components analysis of 991 present-day West Eurasians. The PCA analysis is performed on the same set of 
individuals as are reported in Fig. 1b, using EIGENSOFT. Here, we colour the samples by population (to highlight the present-day populations) instead of 
using grey points as in Fig. 1b (where the goal is to highlight ancient samples). 
Extended Data Figure 2 | Genetic structure in ancient West Eurasian 
populations across time and decline of genetic differentiation over 
time. a, ADMIXTURE model-based clustering analysis of 2,583 present- 
day humans and 281 ancient samples; we show the results only for ancient 
and select present-day populations. 
Extended Data Figure 3 | Outgroup f3(Mbuti; X, Y) for pairs of ancient 
populations. The dendrogram is plotted for convenience and should not 
be interpreted as a phylogenetic tree. Areas of high shared genetic drift 
are ‘yellow’ and include from top-right to bottom-left along the diagonal: 
early Anatolian and European farmers; European hunter-gatherers, Steppe 
populations and populations admixed with steppe ancestry; populations 
from the Levant from the Epipalaeolithic (Natufians) to the Bronze Age; 
populations from Iran from the Mesolithic to the Late Neolithic. 
Extended Data Figure 5 | West Eurasian related admixture in East 
Africa, Eastern Eurasia and South Asia. a, Levantine ancestry in Eastern 
Africa in the Human Origins dataset. b, Levantine ancestry in different 
Eastern African populations in the dataset from Pagani et al. (2012); the 
remainder of the ancestry is a clade with Mota, a ~4,500 year old sample 
from Ethiopia”. c, EHG ancestry in Eastern Eurasians. d, Afontova Gora 
(AG2)-related ancestry in Eastern Eurasians; the remainder of their ancestry 
is a clade with Onge. e, Mixture proportions for South Asian populations 
showing that they can be modelled as having West Eurasian-related ancestry 
similar to that in populations from both the Eurasian steppe and Iran. 
Extended Data Figure 6 | Inferred position of ancient populations in West Eurasian PCA according to the model of Fig. 4. 
Extended Data Figure 7 | Admixture from ghost populations using 
‘cline intersection’. a-f, We model each Test population (purple) as 
a mixture (pink) of a fixed reference population (blue) and a ghost 
population (orange) residing on the cline defined by two other populations 
(red and green) according to the visualization method of Supplementary 
Information, section 10. a, Early/Middle Bronze Age steppe populations 
are a mixture of Iran_ChL and a population on the WHG–SHG cline. 
b, Scandinavian hunter-gatherers (SHG) are a mixture of WHG and a 
population on the Iran_ChL–Steppe_EMBA cline. c, Caucasus hunter- 
gatherers (CHG) are a mixture of Iran_N and both WHG and EHG. 
d, Late Neolithic/Bronze Age Europeans are a mixture of the preceding 
Europe_MNChL population and a population with both EHG and 
Iran_ChL ancestry. e, Somali are a mixture of Mota?’ and a population on 
the Iran_ChL–Levant_BA cline. f, Eastern European hunter-gatherers 
(EHG) are a mixture of WHG and a population on the Onge–Han cline. 
Palaeolithic Siberians Mal’ta 1 (MA1) and Afontova Gora 2 (AG2) are 3(Mbuti; Switzerland_HG, Test). 
Extended Data Table 1 | No evidence for admixture related to sub-Saharan Africans in Natufians 


Other Ancient | African | f4(Natufian, Other Ancient; African, Chimp) | Z | Number of SNPs 
EHG Mbuti -0.00044 -1.0 254033 
EHG Yoruba 0.00029 0.7 254033 
EHG Ju_hoan_North -0.00015 -0.4 254033 
EHG Mota -0.00022 -0.4 253986 
WHG Mbuti -0.00067 -1.7 261514 
WHG Yoruba -0.00045 -1.1 261514 
WHG Ju_hoan_North -0.00046 -1.2 261514 
WHG Mota -0.00129 -2.3 261461 
SHG Mbuti -0.00076  -2.0 255686 
SHG Yoruba -0.00039 -1.0 255686 
SHG Ju_hoan_North -0.00052 -1.4 255686 
SHG Mota -0.00091  -1.7 255641 
Switzerland_HG Mbuti -0.00018 -0.4 261322 
Switzerland_HG Yoruba 0.00019 0.4 261322 
Switzerland_HG Ju_hoan_North 0.00009 0.2 261322 
Switzerland_HG Mota -0.00062 -0.9 261276 
Kostenki14 Mbuti 0.00034 0.7 246765 
Kostenki14 Yoruba 0.00120 2.3 246765 
Kostenki14 Ju_hoan_North 0.00069 1.4 246765 
Kostenki14 Mota 0.00036 0.5 246719 
MA1 Mbuti -0.00038 -0.7 191819 
MA1 Yoruba 0.00009 0.2 191819 
MA1 Ju_hoan_North -0.00010 -0.2 191819 
MA1 Mota -0.00038 -0.5 191782 
CHG Mbuti -0.00051 -1.2 261505 
CHG Yoruba -0.00012 -0.3 261505 
CHG Ju_hoan_North -0.00013  -0.3 261505 
CHG Mota -0.00042 -0.7 261456 
Iran_N Mbuti -0.00018  -0.4 232927 
Iran_N Yoruba 0.00036 0.8 232927 
Iran_N Ju_hoan_North 0.00041 0.9 232927 
Iran_N Mota 0.00006 0.1 232880 


We computed the statistic f4(Natufian, Other Ancient; African, Chimp) varying African to be Mbuti, Yoruba, Ju_hoan_North, or the ancient Mota individual. Gene flow between Natufians and African 
populations would be expected to bias these statistics positive. However, we find most of them to be negative in sign and all of them to be non-significant (|Z| < 3), providing no evidence that 
Natufians differ from other ancient samples with respect to African populations. 
Extended Data Table 2 | Admixture f3-statistics 


Test | Reference1 | Reference2 | f3(Test; Reference1, Reference2) | Z-score | Number of SNPs 
Anatolia_N Iberia_BA Levant_N -0.00034 -0.2 111632 
Armenia_ChL EHG Levant_N -0.00249 -1.5 167020 
Armenia_EBA Anatolia_N CHG -0.01017 -7.9 195596 
Armenia_MLBA Anatolia_N Steppe_EMBA -0.00809 -7.3 203796 
CHG Anatolia_ChL Iran_HotuIIIb 0.02612 3.6 9884 
EHG Steppe_Eneolithic Switzerland_HG -0.00282 -0.9 67938 
Europe_EN Anatolia_N WHG -0.00494 -11.2 380684 
Europe_LNBA Europe_MNChL Steppe_EMBA -0.00920 -41.8 414782 
Europe_MNChL Anatolia_N WHG -0.01351 -26.8 363672 
Iran_ChL Anatolia_N Iran_N -0.01285 -10.6 167941 
Iran_N Iran_LN Gana -0.00462 -1.1 17804 
Levant_BA Iran_N Levant_N -0.00853 -4.7 118269 
Levant_N Europe_MNChL Natufian -0.00671 -3.6 61845 
Natufian Iberia_BA Iran_HotuIIIb 0.07613 3.4 1054 
SHG Steppe_Eneolithic Switzerland_HG 0.00728 3.2 154825 
Steppe_EMBA EHG Abkhasian -0.00756 -11.2 349359 
Steppe_Eneolithic EHG Iran_LN -0.01637 -4.2 25100 
Steppe_MLBA Europe_MNChL Steppe_EMBA -0.00573 -18.0 378298 
WHG Switzerland_HG Saudi -0.01562 -7.7 218758 
Abkhasian CHG Sardinian -0.00754 -13.1 387956 
Adygei Anatolia_N Eskimo -0.00699 -14.4 413128 
Albanian Europe_EN Burusho -0.00650 -16.8 395851 
Armenian Anatolia_N Sindhi -0.00603 -19.5 406021 
Assyrian Iran_N Sardinian -0.00672 -11.8 309055 
Balkar Anatolia_N Chukchi -0.00975 -18.8 401928 
Basque Switzerland_HG Druze -0.00726 -12.6 416070 
BedouinA Europe_EN Yoruba -0.01584 -42.8 460762 
BedouinB Iran_HotuIIIb Natufian 0.01384 4.1 32266 
Belarusian WHG Iranian -0.00974 -19.8 392363 
Bulgarian Anatolia_N Steppe_EMBA -0.00807 -26.7 400263 
Canary_Islander Europe_MNChL Mende -0.00829 -5.9 353172 
Chechen Anatolia_N Eskimo -0.00440 -7.9 396678 
Croatian WHG Druze -0.00871 -18.6 394032 
Cypriot Anatolia_N Sindhi -0.00562 -16.1 401141 
Czech SHG Druze -0.00919 -21.7 374705 
Druze Iran_N Sardinian -0.00269 -5.8 343813 
English Steppe_EMBA Sardinian -0.00628 -20.6 402502 
Estonian SHG Druze -0.00789 -17.6 371575 
Finnish SHG Assyrian -0.00716 -12.6 355744 
French Steppe_EMBA Sardinian -0.00669 -37.9 441807 
Georgian CHG Sardinian -0.00782 -13.7 390744 
German WHG Druze -0.01103 -22.9 391302 
Greek Europe_EN Pathan -0.00600 -30.0 421984 
Hungarian Steppe_EMBA Sardinian -0.00644 -31.2 420017 
Icelandic WHG Abkhasian -0.00974 -17.0 394625 
Iranian Anatolia_N Sindhi -0.00594 -30.9 443011 
Irish Steppe_EMBA Sardinian -0.00590 -22.8 416663 
Irish_Ulster SHG Assyrian -0.00909 -15.6 350547 
Italian_North Europe_EN Steppe_EMBA -0.00627 -26.4 419169 
Italian_South Iberia_BA Iran_HotuIIIb 0.01224 2.6 17678 
Jew_Ashkenazi Anatolia_N Koryak -0.00532 -9.4 389012 
Jew_Georgian Iran_N Sardinian -0.00306 -4.2 292410 
Jew_Iranian Iran_N Sardinian -0.00385 -5.8 302446 
Jew_Iraqi Iran_N Sardinian -0.00486 -6.5 287673 
Jew_Libyan Europe_EN Yoruba -0.00397 -7.2 415797 
Jew_Moroccan Europe_EN Yoruba -0.00649 -10.9 405193 
Jew_Tunisian Anatolia_N Mende -0.00276 -4.1 399354 
Jew_Turkish Anatolia_N Burusho -0.00571 -16.4 405254 
Jew_Yemenite Natufian Kalash -0.00341 -3.8 174052 
Jordanian Europe_EN Yoruba -0.01283 -26.7 423649 
Kumyk Anatolia_N Chukchi -0.01025 -19.6 396439 
Lebanese Anatolia_N Yoruba -0.01022 -19.5 414854 
Lebanese_Christian Anatolia_N Sindhi -0.00504 -15.7 404858 
Lebanese_Muslim Anatolia_N Brahmin_Tiwari -0.00616 -20.4 415129 
Lezgin Steppe_EMBA Jew_Yemenite -0.00481 -13.1 398974 
Lithuanian WHG Abkhasian -0.00999 -17.7 386718 
Maltese Anatolia_N Brahmin_Tiwari -0.00518 -14.5 404438 
Mordovian WHG Iranian -0.00912 -18.4 395230 
North_Ossetian Anatolia_N Chukchi -0.00894 -17.2 401729 
Norwegian WHG Abkhasian -0.00957 -16.5 393546 
Orcadian SHG Druze -0.00662 -15.8 379656 
Palestinian Europe_EN Yoruba -0.01129 -31.3 464066 
Polish SHG Druze -0.00924 -27.8 394654 
Romanian Europe_EN Steppe_EMBA -0.00549 -16.9 397119 
Russian SHG Turkish -0.00731 -25.0 398393 
Sardinian Anatolia_N Switzerland_HG -0.00587 -9.6 417931 
Saudi Anatolia_N Dinka -0.00326 -5.1 404923 
Scottish Steppe_EMBA Sardinian -0.00622 -26.6 426660 
Shetlandic WHG Abkhasian -0.00868 -14.6 386562 
Sicilian Anatolia_N Brahmin_Tiwari -0.00646 -22.2 411481 
Sorb SHG Palestinian -0.00787 -16.8 366924 
Spanish Steppe_EMBA Sardinian -0.00557 -32.2 447735 
Spanish_North WHG Armenian -0.00825 -10.9 356832 
Syrian Europe_EN Dinka -0.01002 -17.3 410920 
Turkish Europe_EN Sindhi -0.00709 -41.1 448975 
Ukrainian WHG Abkhasian -0.01183 -21.4 388282 


We show the lowest Z-score of the statistic f3(Test; Reference1, Reference2) for Test populations with at least 2 individuals and every pair (Reference1, Reference2) of ancient or present-day source 
populations. Z-scores lower than −3 are highlighted and indicate that the Test population is admixed from sources related to (but not identical to) the reference populations. Z-scores greater than −3 
are consistent with the population either being admixed or not. 
Uncovering Earth’s virome 


David Paez-Espino1, Emiley A. Eloe-Fadrosh1, Georgios A. Pavlopoulos1, Alex D. Thomas1, Marcel Huntemann1, 
Natalia Mikhailova1, Edward Rubin1,2, Natalia N. Ivanova1 & Nikos C. Kyrpides1 


Viruses are the most abundant biological entities on Earth, but challenges in detecting, isolating, and classifying unknown 
viruses have prevented exhaustive surveys of the global virome. Here we analysed over 5 Tb of metagenomic sequence data 
from 3,042 geographically diverse samples to assess the global distribution, phylogenetic diversity, and host specificity of 
viruses. We discovered over 125,000 partial DNA viral genomes, including the largest phage yet identified, and increased 
the number of known viral genes by 16-fold. Half of the predicted partial viral genomes were clustered into genetically 
distinct groups, most of which included genes unrelated to those in known viruses. Using CRISPR spacers and transfer 
RNA matches to link viral groups to microbial host(s), we doubled the number of microbial phyla known to be infected 
by viruses, and identified viruses that can infect organisms from different phyla. Analysis of viral distribution across 
diverse ecosystems revealed strong habitat-type specificity for the vast majority of viruses, but also identified some 
cosmopolitan groups. Our results highlight an extensive global viral diversity and provide detailed insight into viral 
habitat distribution and host–virus interactions. 


Viruses are the most abundant entities across all habitats, and a major 
reservoir of genetic diversity! affecting biogeochemical cycles and 
ecosystem dynamics‘. Exploration of viral populations in oceans of 
the world and within the human microbiome has illuminated consid- 
erable genetic complexity”; however, there are significant gaps in the 
global virome catalogue. There are an estimated 10³¹ viral particles 
infecting microbial populations‘; yet fewer than 2,200 genomes from 
double-stranded DNA (dsDNA) viruses and retroviruses are deposited 
in NCBI, compared to over 45,000 bacterial genomes”. Culture- 
independent approaches have provided a broader view of the diversity 
and distribution of dsDNA viruses®. However, their accurate detection 
and quantification using targeted sequencing remains challenging 
owing to the lack of universally conserved genomic signatures and 
complex experimental protocols’. 

Beyond gaps in characterized diversity, the scope of host-viral 
interactions is poorly understood, although it has been hypothesized 
that all cellular organisms are prey to viral attack®. Methods for studying 
host-viral interactions rely almost exclusively on cultured virus—host 
systems; however, recent in silico approaches have revealed that a much 
broader range of hosts is susceptible to viral infections®’®. Given the 
role that viruses play in host metabolism reprogramming, gene flow, 
and structuring of microbial communities, it is critical to capture viral 
linkages with their hosts. 

Currently, a plethora of metagenomic data exists that presents a 
unique opportunity for viral sequence discovery'!. Although most of 
these data sets were generated by untargeted approaches without viral 
particle enrichment, they contain a wealth of viral sequences. Here, 
we developed a computational approach to explore the viral content 
of more than 3,000 metagenomic samples. We uncovered 2.1 Gb of 
viral sequence data, which increases the known viral sequence space by 
an order of magnitude, enables the prediction of previously unknown 
host-viral interactions and provides a global view of viral biogeography. 


Global expansion of viral sequence space 

In the absence of universally conserved markers, previous studies 
attempted to identify viruses using proteins present exclusively in 
viruses!*. To overcome the limitations of a biased collection of isolate 
viruses (iVGs), we complemented the viral protein families of the 
iVGs (derived from dsDNA viruses and retroviruses in the NCBI 
database) with a set of viral protein families from 1,800 manually 
identified metagenomic viral contigs (mVCs). This set was used as a 
bait to identify putative viral sequences in a large collection of assem- 
bled metagenomic contigs longer than 5 kb (Methods; Extended Data 
Figs 1-3; Supplementary Tables 1-7). These contigs were obtained 
from 3,042 metagenomes in the Integrated Microbial Genomes with 
Microbiome Samples (IMG/M) system!! (Supplementary Table 8), rep- 
resenting a collection of geographically and ecologically diverse sam- 
ples according to metadata from the Genomes OnLine Database*’’. 
This led to the identification of 125,842 putative DNA metagenomic 
viral contigs, increasing the viral sequence size in base pairs by 17.3- 
fold and the number of viral genes by 16.6-fold (Fig. 1a; Methods). 
These encode more than 2.79 million proteins, 75% of which have no 
sequence similarity to proteins from known isolate viruses, consistent 
with previous studies!*!*. Sequence similarity clustering of proteins 
encoded by the mVCs resulted in a total of 418,541 clusters with 2 or 
more members and 765,991 singletons (Methods; Supplementary Table 
9). Benchmarking was performed to validate our computational pipe- 
line, and indicated that 70% of the sequences identified in this study 
would have been missed by other methods (Methods; Extended Data 
Fig. 3; Supplementary Tables 2-6). 

To evaluate the coverage of the viral protein space by the newly 
identified sequences, we estimated the rate of accumulation of protein 
clusters as a function of the number of samples (Fig. 1b). In agreement 
with recent reports”), the curves of cluster accumulation in the two 
most heavily sampled habitats, human-associated and marine, appear 
to reach saturation. However, the rate of cluster discovery does not 
plateau when all samples are considered, suggesting that the global viral 
sequence space is largely uncharacterized. 
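
An accumulation curve of this kind can be sketched as follows. Averaging over random sample orderings is an assumption about how such curves are typically built, not a description of the authors' procedure; the function name and toy input are hypothetical.

```python
import random

def accumulation_curve(samples, n_permutations=100, seed=0):
    """Mean number of distinct clusters seen after 1..n samples, over random orders.

    samples: list of sets of protein-cluster identifiers, one set per sample.
    """
    rng = random.Random(seed)
    n = len(samples)
    totals = [0] * n
    for _ in range(n_permutations):
        order = rng.sample(samples, n)   # one random ordering of the samples
        seen = set()
        for i, clusters in enumerate(order):
            seen |= clusters
            totals[i] += len(seen)
    return [t / n_permutations for t in totals]

samples = [{1, 2}, {2, 3}, {3, 4}, {1, 4}]
curve = accumulation_curve(samples)
print(curve)
```

A curve that flattens as samples are added suggests saturation of the habitat; one that keeps rising, as reported for all samples combined, suggests that new clusters are still being discovered.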

To compare the coverage of mVCs and iVGs by viral protein families, 
we calculated the percentage of genes with hits to viral protein families 
relative to the total number of genes on each contig (Fig. 1c). On 
the basis of this percentage, viral contigs were classified into three 
categories: those with at least 70% of genes in viral families (highly 
covered with strong similarity to viruses in the training set); those 
with 35-70% of genes in viral families; and those with less than 35% 
of genes in viral families (low covered with low similarity to viruses 
Figure 1 | Identification of metagenomic viral sequences and habitat 
distribution. a, Number of metagenomic viral contigs compared to isolate 
viral genomes. b, Accumulation curves showing the protein cluster growth 
with increased sampling. Green, blue, and red represent clusters from 
all, aquatic, or human metagenomes, respectively. The ranges represent 

in the training set) (Extended Data Fig. 4a). The highly covered cate- 
gory included 67% of isolate viruses, but only 24.5% of mVCs (Fig. 1c), 
the majority of them from marine and human-associated habitats, 
where more reference viruses were available (Fig. 1d). Another 24.2% 
of mVCs placed in the low-covered category were typically found in 
soil, plant-associated, and engineered samples (Fig. 1d). The differences 
were even more pronounced when the data were normalized by total 
sequence length per habitat (Methods), suggesting the need for more 
extensive sampling of these environments. 
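
The three coverage categories defined above translate directly into a small classifier. The function and label names are hypothetical; the thresholds (at least 70%, 35-70%, below 35%) are those given in the text.

```python
def coverage_category(n_genes_in_viral_families, n_genes_total):
    """Classify a contig by the percentage of its genes with hits to viral families."""
    if n_genes_total <= 0:
        raise ValueError("a contig must contain at least one gene")
    pct = 100.0 * n_genes_in_viral_families / n_genes_total
    if pct >= 70:
        return "high"          # strong similarity to viruses in the training set
    if pct >= 35:
        return "intermediate"
    return "low"               # low similarity to viruses in the training set

for hits, total in [(18, 20), (10, 20), (3, 20)]:
    print(hits, total, coverage_category(hits, total))
```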

The length of mVCs ranges from 5 kb to nearly 600 kb (average 
16,625 + 18,057 bp) (Fig. 1c). On the basis of the end overlaps, 999 of 
mVCs were probably circular, representing complete viral genomes 
(Supplementary Table 10). The average size of the circular mVCs 
(53,644 + 45,677 bp) is consistent with the calculated average length 
of isolate dsDNA viruses (44,296 + 83,777 bp) (Supplementary 
Information). Among circular contigs, we identified the largest phage 
recovered to date, a 596 kb contig from a bioreactor sample’, with many 
signature genes of tailed viruses, but no recognizable housekeeping 
genes of bacteria or plasmids (Methods; Extended Data Fig. 4b; 
Supplementary Table 11; Supplementary Information). We identified 
six more mVCs ranging from 350 to 470 kb, probably representing 
fragments of other large phage genomes (Supplementary Table 12). 
As the sizes of viral particles and viral genomes are correlated'’, these 
mVCs found in many ecological niches point to the hidden diversity 
and abundance of very large phages, probably avoiding detection by 
the conventional enrichment methods. 
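
Detecting a probably circular contig from its end overlap can be sketched as a search for the longest sequence shared by the start and the end of the contig. The minimum overlap length used here is an illustrative assumption, as is the requirement of an exact match; the source does not state the criteria in this section.

```python
def terminal_overlap(sequence, min_overlap=20):
    """Length of the longest prefix that is also a suffix (0 if below min_overlap)."""
    max_len = len(sequence) // 2
    for k in range(max_len, min_overlap - 1, -1):
        if sequence[:k] == sequence[-k:]:
            return k
    return 0

head = "ACGTACGGTTCA"
middle = "TTGGCCAATTGGAACC"
circular_like = head + middle + head   # the start is repeated at the end
linear_like = head + middle

print(terminal_overlap(circular_like, min_overlap=10))
print(terminal_overlap(linear_like, min_overlap=10))
```

A non-zero result indicates that the assembly has wrapped around the genome, which is why such contigs are treated as candidate complete genomes.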


Sequence grouping to gauge viral diversity 

To quantify the amount of taxonomic diversity, mVCs and isolate viral 
genomes were clustered into quasi-species groups on the basis of the 
average amino acid identity'* (AAI) of all proteins and single-linkage 
clustering, using an approach analogous to the whole-genome-based 
classification scheme developed for prokaryotes!? (Methods; Extended 
Data Fig. 5a; Supplementary Table 13). 64,160 mVCs and 2,536 isolate 
contigs were clustered into 18,470 viral groups, ranging from 2 to 
365 members per group. Most groups (57%) had only 2 members and 
only 3.7% had more than 10 members (Extended Data Fig. 5b). Similar 
to previous studies'”!*”°, the vast majority of viral groups (95.9%) did 
not contain isolate viruses. 
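
Single-linkage clustering places two genomes in the same group whenever a chain of sufficiently similar pairs connects them. A minimal union-find sketch is shown below; the similarity threshold and the toy identifiers are hypothetical and are not the cutoffs used in the study.

```python
def single_linkage_groups(pairs, threshold):
    """Group items connected by any chain of pairs with similarity >= threshold.

    pairs: iterable of (item_a, item_b, similarity), for example pairwise AAI.
    Returns groups with two or more members; unlinked items remain singletons.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for a, b, similarity in pairs:
        root_a, root_b = find(a), find(b)
        if similarity >= threshold and root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for item in parent:
        groups.setdefault(find(item), set()).add(item)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)

pairs = [("v1", "v2", 95.0), ("v2", "v3", 92.0),
         ("v3", "v4", 60.0), ("v5", "v6", 97.0)]
groups = single_linkage_groups(pairs, threshold=90.0)
print(groups)
```

Here v1 and v3 end up together even though they were never compared directly, because both are linked to v2; v4 stays a singleton.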

218 viral groups and 842 singletons contained at least one iVG 
with genus- and species-level taxonomic assignment according to 
the International Committee on Taxonomy of Viruses (Methods; 
Supplementary Tables 14, 15). Our method recapitulates current 
species-level groupings in 87% of the cases with the remainder grouping 
at genus level (Supplementary Table 14; Extended Data Fig. 5c-e). 
We compared our method with sequence-based classification used in 
previous studies, which applied protein cluster occurrence to generate 
mostly genus-level groups’. In agreement with an assessment that 
our groups represent quasi-species, our approach resulted in smaller 
clusters and more singletons (Supplementary Information). Next we 
proceeded to predict host specificity and determine environmental 
distribution of these species-level viral groups and singletons. 


Host-virus connectivity revealed 

We used a suite of computational methods to identify putative host- 
virus connections. First, we projected the isolate viral-host information 
onto a group, resulting in host assignments for 2.4% of viral groups 
(Fig. 2a). Then we used the CRISPR-Cas prokaryotic immune 
system, which holds a ‘library’ of genome fragments from phages 
(proto-spacers) that have previously infected the host”. These fragments 
retained by the host in the form of spacers can be matched to phage 
genomes linking phages with their hosts?*-*°. We amassed a database of 
3.5 million spacers from prokaryotic isolate genomes and metagenomes 
in IMG (Supplementary Tables 16, 17). As a control, 98.5% of spacer 
matches against isolate viral genomes agreed with their known 
host specificity at genus or species level (Methods; Supplementary 
Information; Supplementary Table 18). Spacers from isolate microbial 
genomes with matches to mVCs were identified for 4.4% of the viral 
groups and 1.7% of singletons (Supplementary Table 19). Finally, we 
explored the hypothesis that viral transfer RNA (tRNA) genes originate 
from their host”®. Using stringent sequence identity cutoffs, viral 
tRNAs identified in 7.6% of the mVCs were matched to isolate genomes 
from a single species or genus (Methods; Supplementary Information; 
Supplementary Tables 20-22). The specificity of tRNA-based host-viral 
assignment was confirmed by CRISPR-Cas spacer matches showing a 
94% agreement at the genus level (Supplementary Table 19). 

Overall, these approaches identified 9,992 putative host-virus 
associations enabling host assignment to 7.7% of mVCs. The majority 
of these connections were previously unknown, and include hosts 
from 16 prokaryotic phyla for which no viruses have previously been 
identified (Supplementary Table 23), such as the first instance of viruses 
infecting the candidate phylum SR1 (Fig. 2b). We also connected mVCs 
to pathogenic species for which no viral connections were known, 
including Fusobacterium and Leptotrichia that cause oral and skin 
infections in mammals (Supplementary Information; Supplementary 
Table 24). The discovery of phages infecting these and other pathogens 
could be exploited for phage therapy applications”””*. 

Figure 2 | Host-virus connectivity. a, Total number of host assignments 
to metagenomic viruses with three approaches. Total assignments for viral 
groups, singletons, and metagenomic viral contigs are shown above each 
bar. b, Phylogenetic distribution of bacterial and archaeal hosts. For each 
phylum a pie chart indicates the fraction of sequences assigned to this 
phylum from metagenomic viral contigs (red), and isolate viruses (grey). 
The number of metagenomic viral contigs assigned to each phylum is 
indicated by the numbers next to pie charts. Clades in blue represent phyla 
with cultivated representatives. Clades in white represent candidate phyla 
without cultured representatives. 

It is widely assumed that most viruses specialize in infecting related 
hosts, as broad host range is negatively correlated with infection 
success””. However, this may be an artefact?’, and viral generalists that 
infect hosts across taxonomic orders do exist*’. Our data suggested a 
trend for narrow host range with some notable exceptions. Whereas 
most CRISPR spacer matches were from viral sequences to hosts within 
one species or genus (Fig. 3a), some mVCs were linked to multiple 
hosts from higher taxa, including different phyla. A viral group 
comprised of mVCs from human oral samples contained three distinct 
proto-spacers with nearly exact matches to spacers in Actinobacteria 
and Firmicutes (Fig. 3b). In another case (Fig. 3c), proto-spacers from 
two mVCs derived from faecal samples were linked to spacers in three 
distinct Clostridiales families (Extended Data Fig. 6; Supplementary 
Information). As viruses exploit the host transcription/translation 
machinery, the existence of viruses with a surprisingly broad range 
of hosts opens opportunities for identification of novel enzymes or 
regulatory sequences, with biotechnological applications. 


Biogeographic patterns of viral diversity 

Previous studies of viral biogeography mainly focused on single 
habitats?*!32, and only a handful of small-scale studies explored 
viromes across environments**~*. As our pipeline was designed to 
identify longer viral contigs that probably represent more abundant 
populations, we explored the dispersal of the predicted viruses 
by aligning the contigs against all assembled and unassembled 
metagenome sequences (Methods; Supplementary Information). 
This approach revealed that 86% of the viral sequences were found in 
more than one sample, whereas 73% were present in at least 5 samples 
(Extended Data Fig. 7a), mostly from relatively well-sampled marine 
and human-associated habitats. This enabled a detailed investigation of 
viral distribution patterns across these environments (Fig. 4). 

The distribution of viral sequences in marine samples is charac- 
terized by distinct spatial patterns based on water column depth 
and distance from the shore (Fig. 4a). Although viral assemblages in 
coastal waters from distant biogeographical provinces are markedly 
different, bathypelagic samples from different oceanic basins display 
very similar viral profiles, in agreement with observations that 
deep ocean phylum-level composition of microbial communities is 
relatively uniform**. Although viral sequences are mostly partitioned 
into zone-specific groups, some are present in diverse samples across 
zones and oceanic provinces, including one viral group found in 95% 
of all twilight samples and in 44% of deep ocean samples (Extended 
Data Fig. 7b-c). 

Figure 4 | Viral distribution patterns in marine and human samples. 
a, b, Hierarchical clustering of viral groups and singletons across marine 
(a) and human samples (b). Data sets were grouped according 
to environmental metadata or body sub-site, respectively. Oceanic zones 
(a) include Estuary (E), Coastal waters (CW), Coastal sediments (CS), 
Oceanic water photic (OWP; surface to 200 m depth), Oceanic twilight 
(OT; 200 m to 750 m depth), Oceanic deep ocean (ODO; below 750 m), 
Oceanic sediment (OS), and Hydrothermal vents (HV). Virus coverage 
is colour-coded from white (lowest coverage) to red (highest coverage) 
as shown in b inset. Blue represents absence of the corresponding viral 
sequence. 

family), Eubacterium rectale ATCC 33656 
(Eubacteriaceae family), and Ruminococcus 
sp. SR 1/5 (Ruminococcaceae family) (details in 
Supplementary Information). 

The distribution of viral sequences in human microbiome samples 
(Fig. 4b) also shows clear body-site specificity with only a few 
viral groups and singletons found in both faecal and oral samples 
(Supplementary Tables 25, 26). In contrast to previous studies**’~*?, 
many viral sequences, mostly of phage origin, were shared between 
samples from the same body site of unrelated individuals. More than 
30% of intestinal and 50% of oral viral sequences were shared by at least 
10% of sampled subjects (Extended Data Fig. 7d, e). Approximately 
0.5% of sequences in both body sites were shared by more than 80% of 
sampled individuals, whereas 17% and 9% of intestinal and oral viral 
sequences, respectively, were unique to each individual. We used raw 
sequencing reads to estimate the amount of viral sequences in 550 
faecal and oral samples. Viral fraction varied from 0.2 to 54% of the 
total amount of high quality sequence in the sample, with an average 
of 3.4% in oral samples and 7.4% in stool samples, which is higher than 
the previously reported 2.5 to 3.5% in stool” (Supplementary Information). 

Although 84% of our quasi-species viral groups found in multiple 
samples resided within a single habitat type (Fig. 5a), 14% were found 
in two habitat types, typically, within the same broader environmental 
category (Fig. 5b, c), and a small number of groups spanned 
two or more environmental categories (Fig. 5c, d; Supplementary 
Information; Supplementary Table 27). Most of these were due to 
uncertainty of habitat classification (for example, plant rhizosphere 
samples classified as host-associated) (Fig. 5c). A more detailed 
analysis of the most ubiquitous viral sequences revealed that they 
are probably human and laboratory contaminants, including ΦX and 
phages used as vectors, sequencing and molecular weight standards”, 
and Propionibacterium acnes phages, common inhabitants of human 
skin. Several viruses recovered in a wide variety of environments were 
found to be prophages with broad host specificity (Supplementary 
Information; Supplementary Table 28) infecting hosts with different 
habitat preferences. Some of these prophages were found to carry a 
variety of cargo genes, presumably conferring competitive advantage 
to their hosts and explaining their broad distribution*! (Extended Data 
Figs 8, 9). However, in a few cases, the presence of viral groups in diverse 
environments could not be attributed to metadata discrepancies, ambi- 
guity of habitat classification, contamination or broad host specificity. 
A small number of viral groups was found in aquatic samples with large 
differences in salinity, such as freshwater and hypersaline lakes, whereas 
other groups were found in oil-contaminated wastewater, and in human 
oral and faecal samples (Fig. 5c, d; Supplementary Information). Our 
observations of a limited number of ubiquitous viruses expand on 
previous studies**” that shed light on the mechanisms underlying their 
‘cosmopolitanism’. 

To generate a global map of viral dispersal, we linked the viral 
sequences with geographic coordinates of the corresponding samples. 
Many viruses were found in similar ecological niches across large 
geographic distances, with the most prominent connectivity within 
the extensively sampled marine biome (Fig. 6a; Supplementary Table 8), 
which is in agreement with previous studies, suggesting that viruses are 
passively transported along oceanic currents*. We also observed sparse 
but non-negligible connections between non-marine viral groups and 
singletons in samples of the same habitat type across biomes (Fig. 6b) 
and across different ecosystems (Extended Data Fig. 10). 


Discussion 

This study shows that in-depth exploration of ecosystems by untargeted 
metagenome sequencing is a powerful approach to fill knowledge gaps 
and address fundamental questions of viral ecology. Our analysis led 
to a notable increase in the number of viral sequences and putative 
virus-host connections, demonstrating that a much larger prokaryotic 
diversity than previously known is preyed upon by viruses, expanding 
on a recent report of microbial lineages containing prophages”’. 
Consistent with previous observations, the environmental viral 
quasi-species were mostly found to have a narrow host range with a 
few notable exceptions of phages with broad taxonomic host specificity, 
including examples of hosts from different phyla. 

The global maps of viral biogeography show that viruses are predominantly 
found in similar habitats, regardless of their geographic proximity. 
This pattern was most prominent for marine viruses as previously 
observed’, yet was also striking across seemingly isolated locales such 
as lakes, plant-associated habitats and soils, where the dispersal mode 
is not immediately obvious. More surprising was the significant human 
virome sharing between unrelated individuals and the identification 
of viral quasi-species distributed across markedly different habitats. 

Overall, this study demonstrates the value of untargeted de novo 
metagenomic analysis as compared to reference-based and targeted 
virome approaches, highlighting the importance of globally sampled 
metagenomic data sets to vastly improve viral sequence discovery. 
Ultimately, large-scale computational exploration of uncharted viral 
sequence space will assist in addressing the remaining mysteries of 
viral ecology. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS 


Data availability. All the sequence data and metadata from the samples used 
in this work can be accessed through the Integrated Microbial Genomes with 
Microbiomes system IMG/M database! (https://img.jgi.doe.gov) using both 
metagenome and scaffold identifiers provided throughout the manuscript and 
the Supplementary Information. By entering these identifiers in the Genome 
Search tool or Scaffold Search tool (under the ‘Find Genomes’ tab) in the user 
interface, the corresponding sequences, their annotations and their associated 
metadata can be retrieved. Moreover, Hidden Markov Models (HMMs) of viral 
protein families as well as the predicted DNA viral sequences in FASTA format are 
available at the following public FTP site: 
http://portal.nersc.gov/dna/microbial/prokpubs/EarthVirome_DP/. 

Metagenomic samples used in this study. All publicly available metagenomic 
data sets from the IMG/M system (3,042 samples comprising 5 terabase pairs 
of sequences) were used for this analysis!!. The sample collection included 1,729 
environmental samples, 1,079 host-associated samples, and 234 engineered 
samples according to the Genomes OnLine sample classification’. We identified 
putative viral contigs in 1,882 out of the 3,042 metagenomic data sets. The metadata 
for these data sets including sample collection information, library construction 
and sequencing protocols, as well as assembly strategy were retrieved from GOLD 
database®. Based on GOLD metadata, the vast majority of these data sets were 
generated from dsDNA using an untargeted approach (that is, only 59 samples 
underwent viral particle enrichment, viral DNA enrichment or library construction 
with sequencing protocols optimized for the recovery of viral sequences). All of 
the data sets have been annotated by the IMG metagenome annotation pipeline“, 
which performs gene prediction and functional annotation through assignment 
of predicted proteins to protein families, such as Pfam*° and KEGG Orthology 
(KO) clusters*®. Some of the data sets included both assembled and unassem- 
bled data, while others had only assembled sequences (Supplementary Table 8). 
The assembly pipeline used for each data set is described in GOLD. In addition, 
the contiguity of assembled sequences varied greatly from sample to sample. The 
ecosystem subcategories used here were manually curated according to sample 
metadata establishing 10 distinct habitat types: marine, freshwater, non-marine- 
saline and alkaline, thermal springs, terrestrial soil, terrestrial others (including 
mostly deep subsurface samples), host-associated human, host-associated plants, 
host-associated others (including host animal-associated other than human), and 
engineered (for example, bioreactor) (Supplementary Table 8). Only contigs longer 
than 5 kb (59.5 Gb from 5.1 million contigs) were primarily included in this study. 

Normalization factors. We normalized the data sets by the size of the sample 
(measured as total number of bp from sequences larger than 5 kb) per habitat 
type. The normalization factor used in each habitat type was: marine, 29,602 Mb; 
freshwater, 96,314 Mb; non-marine saline and alkaline, 2,825 Mb; thermal 
springs, 1,828 Mb; terrestrial (soil), 5,794 Mb; terrestrial (other), 1,659 Mb; host- 
associated (human), 10,349 Mb; host-associated (plants), 3,909 Mb; host-associated 
(others), 23,452 Mb; engineered, 10,486 Mb. 
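
As an illustration of how these factors might be applied, the following Python sketch rescales a raw per-habitat count to a count per gigabase of sequence. The factors are copied from the list above; the function name `per_gb` and the choice of "per Gb" as the reporting unit are assumptions made for this example and are not taken from the pipeline.

```python
# Normalization factors in Mb, copied from the Methods text.
NORMALIZATION_MB = {
    "marine": 29_602,
    "freshwater": 96_314,
    "non-marine saline and alkaline": 2_825,
    "thermal springs": 1_828,
    "terrestrial (soil)": 5_794,
    "terrestrial (other)": 1_659,
    "host-associated (human)": 10_349,
    "host-associated (plants)": 3_909,
    "host-associated (others)": 23_452,
    "engineered": 10_486,
}


def per_gb(raw_count: float, habitat: str) -> float:
    """Scale a raw count to a count per Gb of sequence in that habitat type.

    The 'per Gb' unit is an assumption chosen for readability.
    """
    return raw_count / NORMALIZATION_MB[habitat] * 1_000


if __name__ == "__main__":
    # Hypothetical raw counts, used only to show the call.
    for habitat, count in [("marine", 1_500), ("thermal springs", 100)]:
        print(habitat, round(per_gb(count, habitat), 2))
```

Dividing by the habitat-specific factor makes counts from heavily sampled habitat types comparable with counts from sparsely sampled ones.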

Isolate reference viruses (iVGs). We used a combination of 2,353 iVGs composed 
of all isolate dsDNA viruses and retroviruses from the NCBI server (http://www. 
ncbi.nlm.nih.gov/genome/viruses/, data accessed on 04/2015) to extract all viral 
proteins and to establish, after filtering, the first round of viral protein families. 
Additionally, we used a list of 5,042 reference viruses (Supplementary Table 13) 
extracted from the IMG/M system (that included all RNA and DNA eukaryotic and 
prokaryotic referenced viral genomes) to generate and validate our viral genome 
clustering method and also to calculate the average length of all reference viruses 
as 44,296 bp ± 83,777 bp s.d. (Supplementary Table 13). 

Generation of viral protein families. 167,042 protein coding genes were collected 
from 2,353 iVGs (dsDNA viruses and retroviruses combined) from the NCBI 
server (http://www.ncbi.nlm.nih.gov/genomes/GenomesGroup.cgi?taxid= 
10239#). After dereplication using 70% identity in usearch*’, 98,000 protein 
sequences were obtained from which 83,500 were clustered into 15,900 groups 
using the Markov Cluster (MCL) algorithm**. Proteins within clusters were 
aligned using MAFFT” and a set of 14,296 viral protein families was created using 
hmmbuild*’. After manual curation of the viral families with high representation in 
prokaryotic genomes, viral protein families were compared against the 5.1 million 
metagenomic contigs longer than 5 kb. 62,000 contigs with 5 or more viral protein 
families were collected, and these were reduced to 9,000 putative viral contigs 
after removing contigs below 50 kb. An additional filtering step was performed 
to exclude contigs with a high number of KEGG Orthology (KO) terms and Pfams 
(10% and 25% respectively); this reduced the number of putative viral contigs 
to 1,589. These were complemented with 66 and 188 sequences derived from 
diverse metagenomic contigs longer than 20 kb that were binned with viruses or 
contained a viral RNA polymerase gene, respectively, and were not captured using 
the previous filter of bearing 5 or more viral protein families (detailed in section 
below; Extended Data Fig. 2; Supplementary Table 1). A total of 1,843 mVCs 
encoding 191,000 proteins were used to complement the original set of 167,042 
proteins derived from iVGs. Repeating the steps described above (that is, usearch 
70% for de-replication, MCL clustering, MAFFT alignment and hmmbuild with a 
filter for viral families abundant in prokaryotes) the final list of 25,000 viral protein 
families was obtained and used for further exploration. 

Identification of metagenomic viral contigs for a training set via manual 
curation, binning and DNA-dependent RNA polymerase alignment. To 
expand the training set of viral sequences, metagenome contigs identified as high- 
confidence viral sequences in the first iteration of our pipeline (Extended Data 
Fig. 1) were complemented with additional metagenome contigs and scaffolds, 
not captured using viral protein families generated from isolate viruses. The first 
approach used k-mer-based binning of 6 metagenome samples that contained 
the highest number of candidate viral sequences that did not satisfy the 
high-confidence threshold owing to an insufficient number of hits to protein models. These 
data sets were binned by Emergent Self Organizing Maps (ESOM; by Ultsch) as 
described previously”! and contig sets outside the bins corresponding to cellular 
organisms were manually checked (Extended Data Fig. 2a). K-mer-based binning 
identified 66 putative novel mVCs from diverse habitat types (freshwater, waste- 
water, thermal vents, and marine with IMG sample identifiers 3300000553, 
3300001592, 3300001681, 3300000116, and 3300001450, respectively). 

The second approach relied on identification of contigs containing RNA pol- 
ymerase with domain composition reminiscent of RNA polymerase (RNAp) 
found in cellular life forms, which could not be placed into one of three domains 
on the tree of life based on their sequence similarity. First, 2,551 representative 
sequences of the genes encoding the three major subunits (α, β, β′) of the RNAp 
gene from bacteria, as well as their eukaryotic and archaeal counterparts, were 
collected from IMG database. Next, the domains of these genes were extracted 
using Pfam models and aligned with MAFFT”. Alignments were manually 
inspected and HMM models were built using hmmbuild*’. These models were 
used to scan metagenomic sequences longer than 5 kb and identified 39,109 contigs 
with matches for at least one core RNAp subunit. After filtering short matches and a 
dereplication step, we obtained 7,437 metagenomic sequences that were combined 
with 2,551 reference isolates to build a tree with 9,309 RNAp sequences using 
FastTree™ with default parameters (Extended Data Fig. 2d). The tree was visualized 
using Dendroscope*’ and the RNAp branch corresponding to large eukaryotic DNA 
viruses was identified on the basis of reference sequences from isolate genomes. 
In addition to eukaryotic viruses, another set of metagenomic RNAp sequences 
branching separately from cellular references turned out to comprise phage RNAp 
with domain composition similar to the bacterial enzyme (detailed in Extended Data 
Fig. 2e). A total of 188 contigs longer than 20 kb containing viral and phage RNAp 
sequences were added to the training set. 

Assignment of metagenomic sequences to viruses. The 25,000 viral protein 
families were used to identify 125,842 DNA metagenomic viral contigs (mVCs) 
longer than 5 kb using 3 distinct filters. First, mVCs were identified from 
metagenomic contigs that had at least 5 hits to viral protein families, with 
<20% of genes covered by KO terms, <40% of genes covered by Pfams and 
>10% of genes covered by viral protein families. Second, metagenomic sequences 
were selected as mVCs when the number of viral protein families on the contig 
was equal to or higher than the number of Pfams. Finally, metagenomic contigs 
for which the number of viral protein families was equal to or higher than 60% 
of the total number of genes were also assigned to mVCs. 
Benchmarking and modelling of this DNA viral discovery computational approach 
are detailed below, demonstrating a specificity of 99.6% for viral detection with a 
37.5% recall rate (sensitivity to identify all viral sequences). 
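
The three filters can be summarized in a short Python sketch. It assumes that per-contig gene counts are already available from annotation; the `ContigStats` fields and the `is_viral` function are hypothetical names, and the percentages are interpreted as fractions of the genes on the contig. Read literally, the second filter would also accept a contig with no viral hits and no Pfams, so the sketch additionally requires at least one viral protein family hit, which is an assumption rather than a documented rule.

```python
from dataclasses import dataclass


@dataclass
class ContigStats:
    length_bp: int
    n_genes: int        # total predicted genes on the contig
    n_vpf_genes: int    # genes with a hit to a viral protein family
    n_ko_genes: int     # genes annotated with a KO term
    n_pfam_genes: int   # genes annotated with a Pfam


def is_viral(c: ContigStats) -> bool:
    """Return True if the contig passes any of the three filters."""
    if c.length_bp <= 5_000 or c.n_genes == 0:
        return False  # only contigs longer than 5 kb are considered

    def frac(n: int) -> float:
        return n / c.n_genes

    # Filter 1: enough viral hits and few cellular annotations.
    filter1 = (
        c.n_vpf_genes >= 5
        and frac(c.n_ko_genes) < 0.20
        and frac(c.n_pfam_genes) < 0.40
        and frac(c.n_vpf_genes) > 0.10
    )
    # Filter 2: at least as many viral families as Pfams
    # (the ">= 1" guard is an assumption of this sketch).
    filter2 = c.n_vpf_genes >= 1 and c.n_vpf_genes >= c.n_pfam_genes
    # Filter 3: viral families on at least 60% of the genes.
    filter3 = frac(c.n_vpf_genes) >= 0.60
    return filter1 or filter2 or filter3
```

A contig is accepted when any one filter passes, which is how the text describes the three filters being combined.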

Benchmarking of computational approaches for virus detection. In order to 
assess the accuracy of our DNA vHMM virus detection pipeline, we generated a 
synthetic metagenome, consisting of finished genomes of 32 bacteria, 3 archaea 
and 5 viruses (Supplementary Table 2), which included 88 replicons. Bacterial 
genomes include representatives of 4 phyla. A total of 132 prophage sequences 
were identified including 99 prophages identified by CyVerse™ implementation of 
VirSorter?? in the categories 1, 2, 4, and 5, and 33 prophages identified by manual 
curation based on the presence of hallmark phage genes and analysis of synteny 
with closely related strains. Coordinates of 35 prophages predicted by VirSorter 
had to be manually adjusted to eliminate bacterial genes (including ribosomal 
RNAs and other housekeeping genes) and to separate 2 prophage sequences called 
as one prophage over an intervening stretch of bacterial genes. Coordinates of 
the prophages are provided in Supplementary Table 3. None of the viruses or 
prophages used in the synthetic metagenome were included in the training set 
used to generate viral HMMs for our pipeline. 

The genome sequences were fragmented to generate 63,222 contigs of length 
5 kb to 60 kb. The distribution of fragmented contigs includes 28,497 5-kb-long 
fragments, 14,228 10-kb-long fragments, 7,096 20-kb-long fragments, 4,723 
30-kb-long fragments, 3,525 40-kb-long fragments, 2,810 50-kb-long fragments 
and 2,343 60-kb-long fragments. The resulting synthetic metagenome is dominated 
by bacterial and archaeal chromosomal fragments with an admixture of a relatively 
small number of plasmid and viral sequences, which is a faithful representation 
of a typical metagenome data set generated by an untargeted approach rather 
than by a targeted virome sequencing approach. The metagenome was submitted 
to a CyVerse implementation of VirSorter and also processed by our vHMM 
pipeline. Only the categories 1, 2, 4 and 5 of VirSorter predictions were considered, 
as manual inspection showed that categories 3 and 6 contained mostly false 
positives. Sequence fragments with at least 3 kb of phage or prophage sequence 
were considered as true positive viral sequences; those with less than 3 kb of phage 
or prophage sequence were considered true negative. 
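
The labelling rule and the two reported metrics can be sketched in Python as follows. The 3-kb threshold is taken from the text; the function names are hypothetical, and specificity and recall use their standard definitions, TN / (TN + FP) and TP / (TP + FN).

```python
def is_true_viral(phage_bp_in_fragment: int, min_viral_bp: int = 3_000) -> bool:
    """A fragment counts as viral if it carries at least 3 kb of phage or prophage sequence."""
    return phage_bp_in_fragment >= min_viral_bp


def specificity_and_recall(truth, predicted):
    """Compute (specificity, recall) from parallel lists of booleans."""
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    tn = sum(1 for t, p in zip(truth, predicted) if not t and not p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return specificity, recall


if __name__ == "__main__":
    # Toy data: bp of phage sequence per fragment, and a detector's calls.
    truth = [is_true_viral(bp) for bp in (5_000, 3_000, 2_999, 0, 0, 4_000)]
    predicted = [True, False, False, False, True, False]
    print(specificity_and_recall(truth, predicted))
```

In an untargeted metagenome most fragments are cellular, so a high specificity keeps the absolute number of false positives small even when recall is modest.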

Calculating the rate of viral protein cluster accumulation and the number of 
proteins with high similarity to proteins encoded by isolate viruses. The 125,842 
metagenomic viral contigs longer than 5 kb encoded a total of 2.79 million proteins. 
BLASTp® with an e-value of 1.0 x 10~° was used and 1 hit per query protein with 
>60% sequence identity and >80% alignment on the shorter sequence. Proteins 
encoded by mVCs were clustered using CD-HIT*” at 60% sequence identity and 
>80% alignment on the shorter sequence. For each sample count, 100 random 
metagenome sets were generated and the total number of protein clusters found 
on the contigs from this set was calculated. This analysis was repeated separately 
for metagenome samples classified as ‘aquatic’ (n = 656) and ‘human’ (n = 673). 
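
The resampling step described above can be sketched in Python. The input is assumed to be a mapping from sample identifier to the set of protein cluster identifiers found on its viral contigs; the function name, the fixed random seed and the choice to report the mean over replicates are assumptions made for this illustration.

```python
import random


def accumulation_curve(sample_clusters, sample_counts, n_replicates=100, seed=0):
    """Mean number of distinct protein clusters for each number of samples.

    sample_clusters: dict mapping sample id -> set of protein cluster ids.
    sample_counts:   iterable of subset sizes to evaluate.
    """
    rng = random.Random(seed)
    samples = list(sample_clusters)
    curve = {}
    for k in sample_counts:
        totals = []
        for _ in range(n_replicates):
            subset = rng.sample(samples, k)  # random set of k metagenomes
            clusters = set().union(*(sample_clusters[s] for s in subset))
            totals.append(len(clusters))
        curve[k] = sum(totals) / len(totals)
    return curve


if __name__ == "__main__":
    toy = {"s1": {1, 2}, "s2": {2, 3}, "s3": {4}}
    print(accumulation_curve(toy, [1, 2, 3]))
```

A curve that keeps rising as samples are added indicates that new protein clusters are still being discovered.
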
Comparison of mVCs protein clusters against all iVGs. Sequence similarity 
of mVCs to iVGs was computed using BLASTp** with an e-value threshold 
of 1.0 x 10-° and alignment length of at least 80% of the shorter protein. No 
percentage identity or bit-score cutoffs were applied (Supplementary Table 9). 

Identification of complete metagenomic viral genomes. To assess the number 
of closed DNA mVCs, we searched for overlapping sequences in the 3’ and 5’ 
region of all the 125,842 metagenomic contigs. Extractseq** was used to trim the 
first 100 bp of each contig and BLAT*? was used to search each 100-bp fragment 
against the respective contig. Only exact overlapping matches for both the 3’ and 5’ 
regions were considered. This resulted in the identification of 999 putatively closed 
mVCs, ranging from 5,037 bp to 630,638 bp in length (average, 53,644 bp ± 45,677 
bp s.d.). Supplementary Table 10 lists all putatively closed mVCs. 
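
The idea behind this check can be illustrated with a minimal Python sketch that uses plain string matching in place of Extractseq and BLAT. It reports a contig as putatively closed when its first 100 bp reappear downstream and the match runs exactly to the 3’ end, so that the two ends overlap. The function name and the requirement that the overlap extend to the last base are assumptions of this sketch.

```python
def terminal_overlap(contig: str, probe_len: int = 100) -> int:
    """Return the length of an exact 5'/3' terminal overlap, or 0 if none.

    The first `probe_len` bases are searched for downstream in the contig;
    a hit counts only if the contig's suffix from that position is identical
    to the contig's prefix of the same length.
    """
    contig = contig.upper()
    if len(contig) < 2 * probe_len:
        return 0
    probe = contig[:probe_len]
    start = contig.find(probe, 1)
    while start != -1:
        overlap = len(contig) - start
        if contig.endswith(contig[:overlap]):
            return overlap
        start = contig.find(probe, start + 1)
    return 0


if __name__ == "__main__":
    end = "ACGTACGTTT"
    body = "GGGCCCAAAT" * 3
    print(terminal_overlap(end + body + end, probe_len=10))
    print(terminal_overlap(end + body, probe_len=10))
```

Exact matching is deliberately strict: a single sequencing error in either end would cause a genuinely circular genome to be missed.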

Viral genome clustering and designation of viral groups. A sequence-based 
classification framework was developed for systematically linking closely related 
viral genomes based on their overall protein similarity. The framework relies on 
both AAI and total alignment fraction (AF) for pairwise comparisons of viral 
sequences, and enables natural grouping of related iVGs and mVCs. The 125,842 
mVCs were combined with all iVGs (DNA and RNA viruses) for the generation of 
the viral group classification framework (Supplementary Information). To reduce 
the number of the AAI comparisons, only mVCs that contained at least one pro- 
tein match with >70% identity across >50% of the shortest protein length were 
selected for pairwise computations. This filter reduced the number of total pairwise 
comparisons from 9.5 billion to 15.9 million. The bidirectional average amino acid 
identity (AAI) was performed as previously described’* for all of the 15.9 million 
pairwise comparisons. This method implements usearch’’ for rapid blast, and 
selects the bidirectional best hit for each protein encoded on the mVC and outputs 
the AAI and the AF. The output was subsequently filtered to include only matches 
that had >90% AAI and >50% AF, which were the observed parameters that best 
reproduced the existing taxonomy of iVGs (Supplementary Information; Extended 
Data Fig. 5a). The high-quality filtered AAI results were then clustered using 
single-linkage hierarchical clustering and visualized in Cytoscape™ (Extended 
Data Fig. 5c-e). 
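
Single-linkage clustering on thresholded pairs amounts to finding connected components in a graph, which the following Python sketch does with a union-find structure. The input format (tuples of two genome identifiers with their AAI and AF percentages) and the function name are assumptions; the thresholds are those quoted above, applied as printed.

```python
def cluster_genomes(genomes, pairs, min_aai=90.0, min_af=50.0):
    """Single-linkage clustering of genomes from pairwise AAI/AF values.

    genomes: iterable of genome ids.
    pairs:   iterable of (genome_a, genome_b, aai_percent, af_percent).
    Returns (groups, singletons), both sorted.
    """
    parent = {g: g for g in genomes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b, aai, af in pairs:
        if aai > min_aai and af > min_af:
            parent[find(a)] = find(b)  # link the two clusters

    clusters = {}
    for g in parent:
        clusters.setdefault(find(g), []).append(g)
    groups = sorted(sorted(m) for m in clusters.values() if len(m) > 1)
    singletons = sorted(m[0] for m in clusters.values() if len(m) == 1)
    return groups, singletons


if __name__ == "__main__":
    genomes = ["A", "B", "C", "D", "E"]
    pairs = [("A", "B", 95, 80), ("B", "C", 92, 60),
             ("C", "D", 85, 90), ("D", "E", 99, 40)]
    print(cluster_genomes(genomes, pairs))
```

Because linkage is single, A and C end up in the same group through B even though they were never compared directly; this chaining is what lets groups grow to hundreds of members.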

Validation of viral groups generated. As a validation of our clustering method we 
observed that 87% of the iVGs (920 out of the 1,060 viral groups or singletons) with 
a taxonomic assignment according to the International Committee on Taxonomy 
of Viruses (ICTV) clustered in agreement with the ICTV-designated species. All 
the remaining 13% of iVGs clustered at the genus-level. From this 13% (represented 
by 140 viral groups that contain at least one iVG) we found that only 49 were phage 
groups, with high pairwise (over 90% AAI) values for the reference viruses within 
each group, suggesting that despite their taxonomic assignments, they were also 
probably members of the same species (Supplementary Table 14). These analyses 
show that our viral groups are taxonomically relevant and provide a useful method 
for organizing distinct viral types. 

Viral host assignment using the CRISPR-Cas system. A CRISPR-Cas spacer 
database of 3.5 million sequences was created using a modified version of the 
CRISPR Recognition Tool® (CRT) detailed in ref. 44 against 40,623 isolates and 
6,714 metagenomes (all data sets from the IMG system as of 9 July 2015). All 
identified spacers were queried for exact sequence matches against all iVGs using 
the BLASTn-short function from the BLAST+ package with parameters: e-value 
threshold of 1.0 × 10⁻¹⁰, percentage identity of 95%, and using 1 as a maximum 
target sequence”. 98.5% of the detected 1,340 spacer hits were to a putative 
bacterial or archaeal host whose taxonomic assignment was in agreement at the 
species or genus level with the existing viral taxonomy (Supplementary Table 18). 
From the remaining matches, 1.2% of the hits agreed at the family level and only 
0.3% of the spacers (2 cases where Pseudomonas spacers matched a Rhodothermus 
phage, and Methylomicrobium spacers that matched Pseudomonas and Burkholderia 
phage) were above family, validating our approach of host assignment based on 
CRISPR-Cas spacer matches. Subsequently, all 3.5 million spacers were compared 
against the 125,842 mVCs, requiring at least 95% identity over the whole spacer 
length, and allowing only 1-2 SNPs at the 5’ end of the sequence. A total of 12,576 
proto-spacers (that is, spacer sequences within a phage genome) were identified. 
Based on CRISPR-Cas spacer matches exclusively from microbial isolate genomes 
we assigned host taxonomy to 8,084 mVCs (representing 6.42% of all the mVCs), 
comprising 826 viral groups (~4.47% of the total) plus 1,100 viral singletons 
(~1.71%) (Fig. 2a; Supplementary Table 19). 
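
To make the matching criterion concrete, the following Python sketch scans a contig for ungapped, full-length matches to a spacer on either strand and keeps those with at least 95% identity. It is a simplified stand-in for the BLASTn-short search: gaps, e-values and the restriction on where mismatches may occur are not modelled, and the function names are hypothetical.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_protospacers(spacer: str, contig: str, min_identity: float = 0.95):
    """Return (position, strand, identity) for full-length ungapped matches."""
    hits = []
    n = len(spacer)
    for strand, query in (("+", spacer), ("-", revcomp(spacer))):
        for i in range(len(contig) - n + 1):
            window = contig[i:i + n]
            matches = sum(1 for a, b in zip(query, window) if a == b)
            identity = matches / n
            if identity >= min_identity:
                hits.append((i, strand, identity))
    return hits


if __name__ == "__main__":
    spacer = "ATGCCGTTAGCATCGGATAC"
    contig = "TTTTT" + spacer + "GGGGG"
    print(find_protospacers(spacer, contig))
```

For a typical spacer of about 30 nt, a 95% identity requirement over the full length permits only a single mismatch, which is why such matches are informative about past infections.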

Host-virus assignment using viral tRNA matches. Identification of tRNAs from 
mVCs was performed with ARAGORN v1.2 (ref. 62) using the ‘-t’ option. In order to 
validate this approach, 2,181 tRNA sequences were recovered from 344 referenced 
viruses (~7% of the total). These were compared against all genomes and 
metagenomes in the IMG system using BLAST, leading to 16,089 perfect hits (100% 
length and 100% sequence identity) after removing self-hits and duplicates. The 
taxonomic assignment of the tRNAs found in iVGs was compared against the 
taxonomic information of the isolate microbial genomes showing that 92.5% of 
the matches agreed at the genus or species level (Supplementary Table 18). After 
culling the top-20 most abundant viral-tRNA sequences (sequences conserved 
across members of the gammaproteobacteria class; Supplementary Table 22) and 
repeating the above steps with mVCs, 32,449 tRNAs within 9,555 mVCs (7.6% out 
of the 125,842 total) were identified, enabling the host assignment for 2,527 mVCs 
(Supplementary Information; Supplementary Table 19). 

Low abundance virus detection. In order to detect the presence of any of the 
mVCs in lower abundances across different habitat types, we expanded our analysis 
to include not only assembled data (that probably represent the most abundant 
viruses) but also unassembled data from 4,169 samples currently available 
in the IMG/M database, which comprises more than 5 Tb of sequences. We used the
BLASTn program in the BLAST+ package to find hits to our 125,842 predicted
viral sequences with an e-value cutoff of 1 × 10⁻⁵⁰, at least 90% identity, and the
hits from the sample covering at least 10% of the length of the viral contig. This 
filtering of BLAST results excluded matches to short highly conserved fragments 
of viral sequences, such as tRNAs, and other spurious hits. Our filtering crite- 
ria were optimized for the type of metagenome data sets available to us, and are 
significantly more stringent than those used in some previous studies for similar 
data (e.g. 95% identity over 75 nt alignment used in ref. 63) or tBLASTx with 
e-value of 1.0 x 107° recommended by ref. 64. However, it was less stringent than 
the 75% coverage used in the analysis of Tara Oceans Viromes, which relied on
viral enrichment to increase viral sequence coverage. For the largest metagenome 
available to us (IMG taxon 3300002568, Grasslands soil microbial communities 
from Hopland, California, USA), this new analysis was able to detect 500 nt of viral 
sequence in 138,769,704,035 nt of total metagenome sequence, which corresponds 
to an abundance of 3.06 × 10⁻⁷%.
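
A hedged sketch of the hit filter described in this paragraph is given below. It assumes the hits have already been parsed from BLASTn tabular output into simple records, and it interprets "covering at least 10% of the length of the viral contig" as the cumulative, non-overlapping span of all passing hits from one sample; the `Hit` record and both function names are illustrative only, and the e-value cutoff is passed in explicitly rather than hard-coded.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Hit(NamedTuple):
    contig: str      # viral contig identifier
    start: int       # 1-based alignment start on the contig
    end: int         # 1-based alignment end on the contig (inclusive)
    identity: float  # percent identity of the alignment
    evalue: float


def covered_length(intervals) -> int:
    """Number of contig positions covered by a set of closed intervals."""
    total, cur_start, cur_end = 0, None, None
    for s, e in sorted((min(s, e), max(s, e)) for s, e in intervals):
        if cur_end is None or s > cur_end + 1:
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = s, e
        else:
            cur_end = max(cur_end, e)  # overlapping or adjacent: extend
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


def detected_contigs(hits: Iterable[Hit], contig_lengths: dict,
                     max_evalue: float, min_identity: float = 90.0,
                     min_fraction: float = 0.10) -> set:
    """Contigs whose passing hits cover at least `min_fraction` of their length."""
    by_contig = defaultdict(list)
    for h in hits:
        if h.evalue <= max_evalue and h.identity >= min_identity:
            by_contig[h.contig].append((h.start, h.end))
    return {c for c, iv in by_contig.items()
            if covered_length(iv) >= min_fraction * contig_lengths[c]}
```

Merging overlapping intervals before measuring coverage avoids counting the same region twice, which is why short conserved fragments hit by many reads cannot pass the length criterion on their own.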

Habitat type specificity of predicted viral sequences based on their BLASTn hits 
in assembled and unassembled data shows their presence even at low abundance, 
depending on the sequence coverage for each specific metagenome (Fig. 4a, b). 
The distribution of less abundant viruses supports the trend that viruses have a 
strong specificity for a particular habitat type since ~84% of all the viral groups are 
found exclusively in a single habitat type. About 14% of the viral groups were found 
in 2 habitat types, and most of these cases could be explained by the uncertainty 
of habitat type classification. For instance, algae-associated microbiomes were 
classified as plant host-associated and shared viral groups with marine samples, 
whereas loose soil samples classified as terrestrial habitat type shared viral groups 
with rhizosphere samples, which were classified as plant host-associated. After 
excluding ambiguously classified cases, most viral groups detected in more than 
1 habitat type were found in the samples from the same environmental category 
(for example, in different aquatic habitats or in different mammalian hosts). We 
further report the finding of ~0.2% of the viral groups in 5 or more habitat types
and discuss the main types of these ‘cosmopolitan’ viruses (probably laboratory 
contaminants, prophages with broad-host specificity, and bona fide lytic phages 
with unexpectedly broad habitat type distribution). 
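
The habitat-specificity percentages quoted above amount to counting, for each viral group, the number of distinct habitat types in which it was detected. A minimal sketch, assuming a hypothetical mapping from viral group identifier to the set of habitat types in which it has hits:

```python
from collections import Counter


def habitat_breadth(group_to_habitats: dict) -> dict:
    """Fraction of viral groups found in exactly k habitat types, keyed by k."""
    counts = Counter(len(habitats) for habitats in group_to_habitats.values())
    n = len(group_to_habitats)
    return {k: v / n for k, v in sorted(counts.items())}
```

The identifiers and habitat labels passed in are placeholders; the real classification comes from the GOLD ecosystem hierarchy described in the Methods.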

Estimation of viral abundance in faecal and oral metagenomes from Human 
Microbiome Project. Raw reads for faecal and oral metagenomes were retrieved 
from the Short Read Archive (http://www.ncbi.nlm.nih.gov/sra/) based on the 
metadata available in GOLD. The reads were quality-filtered and quality-trimmed 
using the rqcfilter tool from the BBtools package (https://sourceforge.net/projects/bbtools/)
with default settings: kmer length for trimming of 23, minimum average
quality of 5 and trim quality threshold of 10; reads shorter than 45 nt after trimming
were discarded. Quality-filtered and trimmed reads were digitally normalized and
error corrected using the bbnorm tool from the BBtools package with default settings.
Normalized reads were assembled using SPADES 3.6.2 (ref. 65) and kmers of 19,
39, 59, 79, 99, selecting an optimal kmer length based on the maximal N50. Average
contig and scaffold coverage of assembled data was calculated by mapping the
quality-filtered and -trimmed reads to the assembly using the bbmap tool from BBtools
with default kmer length of 13 and minimum percentage identity cutoff of 95%.
The unmapped reads were merged using the bbmerge tool from the BBtools package and
the sequences shorter than 100 nt were discarded. mVCs were aligned against these
data using BLASTn and filtered as described above. Only 1 best hit per sequence 
was retained. Coverage of each mVC by sample data was calculated as alignment 
length multiplied by the coverage of the subject sequence and summed over all 
sequences in the sample with hits to this mVC. 
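
Two of the calculations in this paragraph are simple enough to sketch: choosing the assembly with the maximal N50, and summing alignment length × subject coverage per mVC. The function names and input layouts below are assumptions made for illustration; they are not taken from the BBtools or SPAdes interfaces.

```python
from collections import defaultdict


def n50(lengths) -> int:
    """Length L such that contigs of length >= L hold at least half the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


def best_kmer(assemblies: dict) -> int:
    """Pick the kmer whose assembly (list of contig lengths) has the largest N50."""
    return max(assemblies, key=lambda k: n50(assemblies[k]))


def mvc_coverage(best_hits) -> dict:
    """Sum alignment_length * subject_coverage over all hits to each mVC.

    `best_hits` yields (mvc_id, alignment_length, subject_coverage) tuples,
    one per sample sequence (only the best hit per sequence is kept upstream).
    """
    totals = defaultdict(float)
    for mvc_id, aln_len, subj_cov in best_hits:
        totals[mvc_id] += aln_len * subj_cov
    return dict(totals)
```

The resulting value is a coverage-weighted alignment length, so it grows both with the abundance of the matching sample sequences and with how much of them aligns.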

Putative prophage identification. We identified putative prophages among the
125,842 mVCs by using these contigs as queries in a BLASTn comparison
(‘blast+’ package) against all isolate genomes in the IMG database. An e-value cutoff of
1.0 × 10⁻⁵⁰ and percentage identity of 80% were used, and mVCs with cumulative
alignment of at least 75% of mVC length against an isolate genome were considered 
prophage candidates (Supplementary Table 4). 

Global virus distribution maps. Visualization was made with the use of 
Processing programming language (https://processing.org/) and a freely available 
equirectangular projection of the world map (http://eoimages.gsfc.nasa.gov/ 
images/imagerecords/57000/57752/land_shallow_topo_2048.jpg) was used 
as a background image. Sample points are positioned by latitude and longitude 
coordinates of Biosamples obtained from GOLD. Points are coloured based on
a customized reclassification of the GOLD hierarchical ecosystem classification 
(habitat types). Lines between points indicate samples that share at least 2 viral 
groups or singletons. 
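
The geometry behind such a map can be sketched as follows. The original visualization was written in Processing; this Python version is a stand-in for illustration, and the image height of 1,024 pixels is an assumption based on the usual 2:1 aspect ratio of an equirectangular image that is 2,048 pixels wide.

```python
from itertools import combinations


def to_pixel(lat: float, lon: float, width: int = 2048, height: int = 1024):
    """Map latitude/longitude (degrees) to x, y on an equirectangular image."""
    x = (lon + 180.0) / 360.0 * width   # -180..180 maps to 0..width
    y = (90.0 - lat) / 180.0 * height   # 90..-90 maps to 0..height (top to bottom)
    return x, y


def shared_links(sample_to_viruses: dict, min_shared: int = 2) -> list:
    """Pairs of samples that share at least `min_shared` viral groups or singletons."""
    links = []
    for a, b in combinations(sorted(sample_to_viruses), 2):
        if len(sample_to_viruses[a] & sample_to_viruses[b]) >= min_shared:
            links.append((a, b))
    return links
```

Each returned pair would be drawn as a line between the two pixel positions; the all-pairs loop is quadratic in the number of samples, which is acceptable for a few thousand samples.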

Extended Data Figure 1 | Detailed workflow for the identification of
viral sequences from metagenomic data. a, Overview of the acquisition
and filtering of viral protein families in two rounds and their use for the
identification of metagenomic viral contigs larger than 5 kb. In the first
round, proteins from 2,300 double-stranded DNA viruses were grouped
into 16,000 protein families, which were aligned to generate Hidden
Markov Models (HMMs). These HMMs were used in combination
with analysis of k-mer composition and phylogenetic analysis of
DNA-dependent RNA polymerase genes to identify 1,843 high-confidence
metagenome viral contigs. b, c, These contigs were validated by manual
analysis (b) and the proteins from this set were combined with the isolate
viral proteins to generate a final set of 25,000 viral protein families (c).
d, HMMs generated from alignment of these protein families were used to
identify 125,842 metagenomic viral contigs. Processing steps detailed in
b-d are described in the Methods. The final mVCs were then grouped and
assigned to their hosts via CRISPR-Cas spacer matches and viral tRNA
matches against isolate microbes (not shown in this figure).

Extended Data Figure 2 | Identification of metagenomic viral contigs
via binning and DNA-dependent RNA polymerase alignment.
a-c, Three distinct metagenomic examples of tetranucleotide Emergent
Self Organizing Maps (ESOM) as a binning method for identification
of candidate viral sequences in metagenome data sets. Tetranucleotide
binning of metagenomic samples (full list in Supplementary Table 1)
was used to identify highly divergent viral sequences, which were left
undetected using viral protein families generated from isolate viruses.
Each dot on the maps represents a 10 kb fragment of a metagenomic
scaffold longer than 20 kb. ‘Bubbles’ (ESOM structures) correspond to
fragments with similar tetranucleotide composition probably originating
from the same genome. Red dots represent viral sequences detected by
viral protein families generated for isolate viruses; white dots represent
highly divergent viral sequences with no hits to viral protein families.
a, ESOM of freshwater sample (combined assembly of freshwater
microbial communities from Lake Mendota and Trout Bog Lake, IMG
identifier 3300000553) shows 2 putative viral sequences previously
unidentified (IMG scaffold identifiers 10001161 and 10001271).
b, ESOM of marine sample (marine microbial communities from
Delaware Coast, sample from Delaware MO Spring March 2010, IMG
identifier 3300000116) shows 2 putative viral sequences (IMG
scaffold identifiers c10000689 and c10000429). c, ESOM of hydrothermal
vent sample (black smokers hydrothermal plume microbial communities
from Abe, Lau Basin, Pacific Ocean, IMG identifier 3300001681) shows
2 viral sequences (IMG scaffold identifiers 10000222 and 10000095).
Metagenome samples can be found in IMG using IMG identifiers and
‘Quick Search’ or ‘Genome Search’ tools; metagenome scaffolds can
be found using the scaffold identifier and the ‘Scaffold Search’ tool on the respective
‘Microbiome Details’ page. d, e, DNA-dependent RNA polymerase genes
of likely viral origin from metagenomic sequences longer than 5 kb.

d, Hidden Markov Models (HMMs) were built for sequences
corresponding to α, β and β′ subunits of bacterial DNA-dependent RNA
polymerase for a representative set of 2,551 cellular organisms (archaea, 
bacteria, and eukaryotes) and viruses. These models were used to search 
the proteins encoded by metagenomic contigs longer than 5 kb and the 
proteins with hits were aligned against the HMMs. A total of 7,437 nearly 
full-length metagenomic sequences were combined with 2,551 reference 
sequences to reconstruct the phylogenetic tree using FastTree tool. Two 
distinct branches on this tree were separated from the sequences from 
cellular organisms and included RNA polymerase genes from eukaryotic 
viruses (green box) and putative phage sequences with domain structure 
similar to that of bacterial RNA polymerase (red box, marked with double 
asterisk). Only 122 out of the 400 contigs in the eukaryotic viral RNA 
polymerase branch were captured by isolate protein families. e, Detailed 
view of the RNA polymerase tree branch with putative phage sequences. 
Metagenome contigs detected as viral by viral protein families and by 
spacer hits are marked with a square or circle next to it. Gene structure 
for selected contigs (IMG chromosomal neighbourhood view) is shown 
in the boxes. In the examples, genes are coloured based on predicted 
function category (using Clusters of Orthologous Genes prediction) and 
are specified in the figure. White-coloured genes correspond to those with 
hypothetical or unknown function. 

Extended Data Figure 4 | Detailed gene content of singular
metagenomic viral contigs examples. a, Gene content of the
metagenomic partial viral genome with the lowest gene coverage
by viral protein families. The length of the partial viral genome is
81,542 bp (guanine and cytosine (GC) content of 43%; 163 total genes)
and was identified from a bovine rumen metagenome (IMG scaffold
identifier, rumenHiSeq_NODE_3763566_len_81492_cov_5_518198;
IMG metagenome identifier, 2061766007). White-coloured genes
correspond to those with hypothetical or unknown function. Only 3%
of the genes were covered by VPFs. b, Gene content of the largest closed
viral genome identified to date. The length of the closed (circular) viral
genome is 596,617 bp (GC, 40%; 1,148 total genes) and was identified
from a bioreactor metagenome (IMG scaffold id: D1draft_1000006,
from Bioreactor L1-648F-DHS sludge microbial communities sample).
Predicted gene function is coloured based on Clusters of Orthologous
Genes. Black triangles indicate tRNA sequences (a, b). A total of 11% of
the genes were covered by VPFs. Specific viral genes distributed across the
genome are boxed in red, identified with a number, and described in the
legend table. The detailed information of the whole gene content of this
viral genome is located in Supplementary Table 11.

Extended Data Figure 5 | Viral group clustering method. a, Parameters
used in the clustering of viral sequences. We used all 5,042 reference isolate 
viral genomes (iVGs) to group them using single-linkage hierarchical 
clustering (SLC) with different combinations of AAI and AF values to 
validate the clustering approach. The thresholds for AAI and AF were set 
at 90% and 50%, respectively, (boxed in purple) and were selected based on 
the accurate grouping of iVGs that was in agreement at the genus level, and 
the vast majority at the species level, according to the ICTV classification 
system (Supplementary Information). Further, these thresholds reduced 
the number of total connections (green line referred to secondary y axis)
compared with lower AAI thresholds, without altering the total number
of singletons and viral groups created (red and light green bars referred to
primary y axis, respectively), as well as the average number of members
per viral group (shown at the bottom of the figure). b, Size distribution
of viral groups. Distribution of the 66,696 viral genomes clustered into
18,470 viral groups. Number of viral members (spanning from 2 to 365)
per viral group is shown. c-e, The Cytoscape visualization of some viral
groups. c, Major reference isolated viral groups created using SLC with
AAI and AF values of 90% and 50%, respectively. Cytoscape force-directed 
(unweighted) layout option was used to visualize these groups. Black 
nodes represent isolated viral genomes whereas orange and green nodes 
represent metagenomic viral contigs clustered with isolates from host- 
associated and environmental samples, respectively. Group edges connect 
viral groups based on the above cutoffs. d, The four largest viral groups 
created from metagenomic viral contigs (containing 365, 201, 165, and 
152 members, respectively). Specific habitat information of the samples as 
well as the viral group identifier is shown in the figure. e, Examples of viral 
groups (vg_2932 and vg_2864) containing proto-spacers (indicated by 
green circles) found in the CRISPR-Cas system of the indicated bacterial 
taxon. All the metagenomic viral contigs clustered in both viral groups 
were found in the same habitat subtype: human oral samples for vg_2932, 
and human faecal samples for vg_2864 (with a sole exception in the latter 
group that derived from an oral sample, indicated with a red arrow). 
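
At a fixed pair of thresholds, single-linkage clustering reduces to finding connected components of the graph whose edges are the genome pairs passing both cutoffs. The sketch below illustrates that idea with a union-find structure; the input layout (pairs given as `(genome_a, genome_b, AAI, AF)` with both values in percent) and the function name are assumptions, not the authors' implementation.

```python
def single_linkage_groups(genomes, pairs, min_aai: float = 90.0, min_af: float = 50.0):
    """Group genomes linked by any chain of pairs with AAI >= min_aai and AF >= min_af.

    Returns (groups, singletons): groups is a list of sets with 2+ members,
    singletons is a list of genomes left unlinked.
    """
    parent = {g: g for g in genomes}

    def find(x):
        # Follow parent pointers to the root, shortening the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, aai, af in pairs:
        if aai >= min_aai and af >= min_af:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a  # merge the two components

    clusters = {}
    for g in genomes:
        clusters.setdefault(find(g), set()).add(g)
    groups = [c for c in clusters.values() if len(c) > 1]
    singletons = [next(iter(c)) for c in clusters.values() if len(c) == 1]
    return groups, singletons
```

Because a single passing pair is enough to merge two clusters, members at opposite ends of a large group need not pass the thresholds against each other, which is the usual caveat of single linkage.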

Extended Data Figure 6 | Verification of viruses identified with broad-host
range. a, b, Alignments of all contigs found in the IMG database
containing any of the 3 spacer matches present in a viral group potentially
infecting 2 different phyla or any of the 7 spacer matches present in a
viral group potentially infecting 3 different families are shown in a and
b, respectively. Alignments were performed by mapping all the matches
(48 for a, and 128 for b; named with an IMG scaffold identifier) to a viral
representative using the ‘map to reference’ package of Geneious software
(http://www.geneious.com). Black lines represent 100% sequence identity
to the reference virus. The location of the 3 spacers (that derived from 2
different phyla) in a as well as the 7 spacers (that derived from 3 different
families) in b is indicated with triangles with different colours. Spacer
sequences, as well as the genomes that contain them in a CRISPR locus, are
boxed at the bottom.

Extended Data Figure 7 | Habitat type specificity of all viral diversity
and specific examples. a, Distribution of the presence of the total viral 
diversity of metagenomic viral contigs (viral groups and singletons) across 
distinct number of habitat types. A total of 85.9% of all viral diversity 
resided in a single habitat type (either as a singleton 19.7%, as a viral group 
found in a single sample 1.8%, or as a viral group found in 2 or more 
samples 64.4%), whereas only a small fraction (0.31% of all mVCs) were 
found in 4 or more different habitat types. b, c, Examples of viral groups 
found in diverse samples across different oceanic zones and provinces. 
Presence of a single viral group across distinct marine samples based on 
average coverage values (red bars; y axis on the left) and total percentage of 
the viral sequence length recovered per sample (purple line; y axis on the 
right). Samples were grouped by marine zones and indicate the percentage 
of the total samples per zone. b, Representative of viral group 2463 (IMG 
taxon id: 3300001450 and IMG scaffold id: JG124006J15134_100002847) 
was found exclusively in marine biomes at depth and with reduced 
exposure to sunlight (across 95% of all twilight samples and in 44% of 
deep ocean samples). c, Representative of viral group 10643 (IMG taxon
id: 3300000216 and IMG scaffold id: SI53jan11_150mDRAFT_c1002499)
was detected preferentially across coastal water samples (28% of all samples
of this zone, preferentially in oxygen minimum zones), but was also present
in twilight, deep ocean, and hydrothermal vent samples. This viral
group was identified as a SUP05-infecting phage. The genes of the viral
contig representatives were coloured by the phylogenetic distribution
of the best hit in the database (white, unknown; green, Proteobacteria;
blue, Chlorophyta; red, unclassified virus; turquoise, Firmicutes; purple,
Deinococcus). d, e, The distribution of viral sequences of distinct body 
sub-sites across different individuals. Hierarchical clustering (average 
linkage using Jaccard distance) was used for both axes (samples and 
individuals) across ‘large intestine’ (d) and ‘oral’ metagenomes (e), 
respectively (top chart in both panels). Presence or absence of viral groups 
or singletons per sample is colour-coded as red or blue, respectively. The
line charts of both panels show the percentage of viral sharing for >50%,
50-10%, and <10% of the individuals (vertical lines) highlighting in red 
boxes the percentage of viral sharing for >80% as well as viral sequences 
only present in a single individual. 
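
The Jaccard distance used for the clustering in d and e, and the per-virus sharing percentage plotted in the line charts, can be written down in a few lines. This is an illustrative sketch with hypothetical input layouts (sets of viral identifiers per sample or per individual); the average-linkage clustering step itself is not reproduced here.

```python
def jaccard_distance(a: set, b: set) -> float:
    """1 - |A ∩ B| / |A ∪ B|; defined as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def sharing_fraction(individual_to_viruses: dict) -> dict:
    """Fraction of individuals in which each viral group or singleton is present."""
    n = len(individual_to_viruses)
    counts = {}
    for viruses in individual_to_viruses.values():
        for v in viruses:
            counts[v] = counts.get(v, 0) + 1
    return {v: c / n for v, c in counts.items()}
```

A pairwise matrix of `jaccard_distance` values is the kind of input that an average-linkage routine would take.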

Extended Data Figure 8 | Alignment of broad-host specificity
prophage in 20 isolate genomes in IMG using ‘Gene Neighborhood’ 
tool. The gene ‘adenine-specific DNA methyltransferase’ is used as an 
anchor for the alignment (in red). Genes are coloured according to COG 
cluster annotation, with light yellow representing genes without COG 
assignment. Blue boxes highlight likely cargo genes inserted in prophage 
genomes. These include: (1) alkyl hydroperoxide reductase system in 
Dehalogenimonas lykanthroporepellens, Desulfococcus biacutus and 
Geobacter sulfurreducens, (2) efflux ABC transporter in Desulfoarculus
baarsii and Desulfobacterium anilini, (3) possible secondary metabolite
biosynthesis genes in Desulfovibrio aespoenensis, (4) restriction system in
Desulfovibrio paquesii and Geoalkalibacter subterraneous, (5) methionine
synthase in Desulfovibrio sp. L21-Syr-AB, (6) molybdate ABC transporter
in Desulfomicrobium thermophilum, (7) ABC transporter involved in
multi-copper enzyme maturation in Desulfovibrio alkalitolerans; and
(8) likely antibiotic resistance cassette in Geobacter soli. Details in
Supplementary Table 24. 

Extended Data Figure 9 | Distribution of hits to broad-host prophage
and its potential hosts in metagenomic samples. The hits to prophage 
sequences and host marker genes (RNA polymerase subunits and 
ribosomal proteins) were identified by BLASTn with e-value 1.0 × 10⁻⁵⁰,
90% nucleotide identity and cumulative alignment length of at least 10% 
of the length of the prophage or concatenated marker genes. Metagenome 
samples grouped by habitat are shown on the y axis; boxes correspond to 
broad environmental categories. Red box surrounds non-human host- 
associated samples (worm and termite symbionts), green box surrounds 
environmental samples (aquatic and terrestrial), blue box surrounds 
engineered samples (wastewater and bioreactors). Average coverage of the 
prophage and concatenated host marker genes is plotted on the x axis. 

Extended Data Figure 10 | Global connectivity of viral diversity from
different habitat types. Geographic location of metagenomic samples
containing the same viral groups and singletons represented by a white
connecting line across metagenomes from different habitats. Only samples
sharing 2 or more viral groups or singletons that are more distant than
10 pixels (area shown as a red square in the figure) are connected. The colours
of the samples (circles) indicate the habitat type according to the legend.
A freely available equirectangular projection of the world map was used as a
background image (http://visibleearth.nasa.gov/view.php?id=57752).

Molecular basis of APC/C regulation by
the spindle assembly checkpoint 


Claudio Alfieri¹*, Leifu Chang¹*, Ziguo Zhang¹, Jing Yang¹, Sarah Maslen¹, Mark Skehel¹ & David Barford¹


In the dividing eukaryotic cell, the spindle assembly checkpoint (SAC) ensures that each daughter cell inherits an identical 
set of chromosomes. The SAC coordinates the correct attachment of sister chromatid kinetochores to the mitotic spindle 
with activation of the anaphase-promoting complex (APC/C), the E3 ubiquitin ligase responsible for initiating chromosome 
separation. In response to unattached kinetochores, the SAC generates the mitotic checkpoint complex (MCC), which 
inhibits the APC/C and delays chromosome segregation. By cryo-electron microscopy, here we determine the near-atomic 
resolution structure of a human APC/C-MCC complex (APC/C^MCC). Degron-like sequences of the MCC subunit BubR1
block degron recognition sites on Cdc20, the APC/C coactivator subunit responsible for substrate interactions. BubR1 also 
obstructs binding of the initiating E2 enzyme UbcH10 to repress APC/C ubiquitination activity. Conformational variability 
of the complex enables UbcH10 association, and structural analysis shows how the Cdc20 subunit intrinsic to the MCC 
(Cdc20^MCC) is ubiquitinated, a process that results in APC/C reactivation when the SAC is silenced.


The fidelity of chromosome separation at each cell division cycle 
ensures the inheritance of the correct complement of genetic material 
in successive generations of cells. The APC/C (also known as the 
cyclosome) initiates sister chromatid separation by controlling the 
proteasomal degradation of securin and cyclin B. Their degradation
allows separase to remove sister chromatid cohesin. Crucial to the
maintenance of chromosome integrity of dividing cells is the SAC.
The SAC responds to unattached kinetochores by generating the MCC, 
which functions to suppress APC/C-catalysed ubiquitination of securin 
and cyclin B. 

Although the components of the SAC machinery are known,
and some details of the molecular events that sense the absence of
kinetochore attachment (and possibly intra-kinetochore tension)
to signal MCC assembly have been characterized, important
questions still remain. Intrinsic to this process is the conversion of
open (O-) to closed (C-) Mad2, catalysed by a C-Mad2-Mad1 complex
at unattached kinetochores. Soluble C-Mad2 engages the
N terminus of Cdc20 (refs 10, 11), the mitotic activating subunit
of the APC/C, which then binds the BubR1-Bub3 dimer to form
the MCC. Mad2 and BubR1 (also known as Mad3) interact
cooperatively with Cdc20 (refs 9, 13-18), and synergistically inhibit
the APC/C during mitosis.

In an important advance, it was proposed, and shown, that the
tetrameric MCC inhibits the APC/C already in complex with the
regulatory subunit Cdc20 (APC/C^Cdc20), which recognizes destruction
box (D-box), KEN box and ABBA motif degrons of APC/C substrates
and promotes the catalytically active conformation of the APC/C.
In addition to inhibiting the APC/C, the MCC contributes to APC/C
reactivation after SAC silencing through proteasome-catalysed Cdc20
degradation. SAC-mediated Cdc20 proteolysis is dependent on
the APC/C, Mad2 and BubR1 (refs 17, 23-27), suggesting that Cdc20
ubiquitination occurs in the context of the APC/C^MCC, an idea
supported by findings showing that release from mitotic arrest, concomitant
with Cdc20 destruction, requires the small APC/C subunit Apc15
(refs 28-30).

To obtain insights into reciprocal APC/C and MCC regulation, we
reconstituted recombinant complexes of APC/C^MCC and APC/C^MCC
plus the APC/C initiating E2 enzyme UbcH10 for structural and
biochemical analysis. From a cryo-electron microscopy (cryo-EM)
reconstruction of the APC/C^MCC, we identify conformational variability
of the complex that explains its capacity to repress substrate
ubiquitination, but also enables UbcH10 to catalyse intramolecular
Cdc20^MCC ubiquitination.


Reconstitution and overall features of APC/C^MCC

We reconstituted recombinant APC/C^MCC using the insect cell/
baculovirus expression system. Recombinant APC/C^MCC incorporates
two distinct Cdc20 subunits, termed Cdc20^APC/C and Cdc20^MCC for
the APC/C^Cdc20-associated and MCC-associated subunits, respectively
(Extended Data Fig. 1a, j and Extended Data Table 1), consistent with
previous findings. We determined negative-stain and cryo-EM
reconstructions of the APC/C^MCC complex (Extended Data Table 2).
The negative-stain EM reconstruction of the recombinant APC/C^MCC
is essentially identical in structure to endogenous APC/C^MCC isolated
from checkpoint-arrested HeLa cells determined at a similar resolution
(Extended Data Fig. 2b). This substantiates the model that the
physiological form of APC/C^MCC includes two Cdc20 subunits.
Both reconstructions feature a large density element termed the
MCC-Cdc20^APC/C module (MCC interacting with the Cdc20^APC/C subunit of
APC/C^Cdc20) occupying the APC/C central cavity, extending from the
‘front’ side of the platform domain (Extended Data Fig. 2b).

To understand quantitatively how the MCC interacts with APC/C^Cdc20,
we determined a cryo-EM reconstruction of APC/C^MCC at near-atomic
resolution (Fig. 1a, b, Extended Data Figs 2c-e, 3 and Extended Data
Table 2). Extensive 3D classification of APC/C^MCC revealed conformational
variability of the MCC-Cdc20^APC/C module. This module
adopts a stable, rigid conformation in 21% of APC/C^MCC particles
(defined as APC/C^MCC-closed) (Extended Data Fig. 4b, class 1). A local
resolution map of APC/C^MCC-closed shows that the central rigid regions
are at 3.9-4.1 Å resolution, with the flexible outer regions at lower
resolution (Extended Data Fig. 3c, d). We built an atomic model of
APC/C^MCC-closed (Fig. 1 and Supplementary Video 1) guided by the
cryo-EM structure of human APC/C^Cdh1.Emi1 (ref. 32) and the crystal
structure of fission yeast MCC.

In the APC/C^MCC-closed reconstruction, the secondary structure
elements of the MCC-Cdc20^APC/C module are clearly visible

Figure 1 | Overall structure of the APC/C^MCC complex. a, b, Two views
of APC/C^MCC. The MCC-Cdc20^APC/C module is shown as a cartoon and
the APC/C in a surface representation. BubR1 forms extensive contacts
with Cdc20^APC/C and Apc2. BubR1 inhibitory degrons visible in these
views are highlighted.


(Extended Data Fig. 2e), revealing that Cdc20^APC/C, Cdc20^MCC, Mad2
and the N-terminal and middle regions of BubR1, including its
tetratricopeptide repeat (TPR) domain (BubR1^TPR) and its inhibitory
degrons, are well defined (Figs 1a, b and 2). However, despite their
presence in reconstituted APC/C^MCC (Extended Data Figs 1a and 2a),
the C-terminal pseudo-kinase domain of BubR1 (Fig. 2a) and Bub3
were not visible in EM density, indicative of conformational variability.
In agreement with this, both the structure and activity of APC/C^MCC
with Bub3 and the BubR1 C terminus deleted (APC/C™™MCC) 
are indistinguishable from APC/C^MCC (Extended Data Figs 1b, h, i
and 2b). These data are consistent with the requirement of the
N-terminal ~363 residues of BubR1 (Fig. 2a) to sustain a SAC.


General architecture of APC/C^MCC

In APC/C^MCC, the MCC core elements comprising Cdc20^MCC,
BubR1^TPR and Mad2 resemble their counterparts in Schizosaccharomyces
pombe MCC (Fig. 1 and Supplementary Video 1). Mad2 adopts
the closed conformation (C-Mad2) with its C-terminal segment (‘safety
belt’) locking the tubular density of the Cdc20^MCC KILR motif
(Figs 1, 2c and Extended Data Fig. 5c). The MCC docks into the
APC/C^Cdc20 cavity directly below Cdc20^APC/C, and interacts with
Cdc20^APC/C such that the two Cdc20 WD40 domains of APC/C^MCC are
arranged in an almost perpendicular fashion (Figs 1 and 2c). Cdc20^MCC
and BubR1 mediate interactions between the MCC and APC/C^Cdc20,
with BubR1 dominating these interactions through its contacts
to Cdc20^APC/C and Apc2. C-Mad2 forms no direct contacts with
APC/C^Cdc20 (Fig. 1). However, by stabilizing the association of BubR1
with Cdc20^MCC, C-Mad2 indirectly augments BubR1 interactions with
APC/C^Cdc20 (refs 14, 19, 35). APC/C^MCC-closed is similar in structure
to APC/C and coactivator complexes, with major conformational
differences confined to Cdc20^APC/C and the platform subunits Apc4,
Apc5 and Apc15 (Extended Data Fig. 5d, e).

Contacts between the two Cdc20 subunits of APC/C^MCC-closed are
mainly mediated by BubR1 that intertwines between them (Fig. 2c).
In response to MCC binding, the WD40 domain of Cdc20^APC/C is
tilted by ~40° and rotated 90° about its central axis (Extended Data
Fig. 5d). This disrupts the D-box-binding site formed by the interface 
of the D-box co-receptors of Cdc20^APC/C WD40 and Apc10 (refs 21, 32),
and pulls the C-terminal isoleucine-arginine motif (IR tail) of
Cdc20^APC/C away from the IR-tail-binding site of Apc3A. A disengagement
of the Cdc20^APC/C IR tail from Apc3A is consistent with weak EM
density at the IR-tail-binding site (Extended Data Fig. 5a), and the
finding that Cdc20 does not require Apc3 to bind the APC/C when
the SAC is active. In contrast to the IR tail, the N-terminal domain
of Cdc20^APC/C maintains the same interactions with Apc8B and
Apc1^PC as seen in APC/C^Cdc20 (ref. 22) (Fig. 3a and Extended Data
Fig. 5b).

Figure 2 | Interactions of BubR1 with Cdc20^APC/C and Cdc20^MCC.
a, Schematics of BubR1 (top) and Cdc20 (bottom). b, Schematic
representation of the top views of the Cdc20^APC/C and Cdc20^MCC WD40
domains. WD40 domain blades are numbered and the positions of BubR1
inhibitory degrons (orange) are indicated. The CRY degron mediates
Cdc20^MCC interactions with Cdc20^APC/C (Extended Data Fig. 5f). c, Two
views showing details of the MCC-Cdc20^APC/C module. Cryo-EM density
of the BubR1 inhibitory degrons, Cdc20^MCC CRY box and KILR motif is
shown. Interactions of the BubR1 A1 motif with the Apc10 D-box coreceptor
and Apc1; BubR1^TPR with Apc2^WHB; and Cdc20^MCC with Apc4^HBD are
indicated (bottom). Inset, overall view of APC/C^MCC.


BubR1 interacts with both Cdc20 subunits of APC/C^MCC

Similar to yeast MCC, in APC/C^MCC, the N-terminal KEN-1 box
(K1) (Fig. 2a and Supplementary Video 1) engages the KEN-box
recognition site of the Cdc20^MCC (Fig. 2b, c). Immediately after
BubR1^TPR and preceding KEN-2 (K2) is the N-terminal D-box motif
(D1) (Fig. 2a). We assigned D1 to the loop-like density at the D-box
recognition site of Cdc20^APC/C, although owing to the re-orientation
of Cdc20^APC/C in APC/C^MCC, its position is diametrically opposed to
its location in active APC/C-coactivator complexes (Extended
Data Fig. 5d). D1 and its flanking residues mediate the interface of
Cdc20^APC/C and Cdc20^MCC (Figs 1 and 2c, bottom). This explains
how mutating D1 disrupts MCC binding to a second Cdc20 subunit
without affecting MCC integrity. By engaging the Cdc20^APC/C D-box
co-receptor in APC/C^MCC, D1 would obstruct D-box-dependent
substrate recognition, consistent with its requirement for a checkpoint
arrest.

EM density immediately C-terminal to D1 is disordered (Fig. 2c,
bottom); however, uninterrupted tubular density situated nearby and
assigned to BubR1 is clearly defined wrapping around the opposite
side of the Cdc20^APC/C WD40 domain, connecting its bottom and
top surfaces and contacting both the Cdc20^APC/C KEN and ABBA
motif-binding sites (labelled K2 and A1 in Fig. 2c). This BubR1 EM
density feature bears a notable resemblance to the structure of Acm1
(APC/C-Cdh1-modulator 1), an inhibitor of yeast Cdh1 that uses
ABBA, D-box and KEN motifs to block the degron-recognition sites
of Cdh1 (ref. 38) (Extended Data Fig. 6a). The BubR1 EM density
contacting the top surface of Cdc20^APC/C corresponds to the KEN-2
box (Fig. 2c, top). Guided by the Cdh1-Acm1 crystal structure (ref. 38),
we modelled the KEN-2 box and the preceding ABBA motif (Fig. 2a and
Extended Data Fig. 6a, b) to the KEN box and ABBA motif recognition
sites of Cdc20^APC/C (Fig. 2c).

The KEN-2 motif is invariant in BubR1 orthologues (Extended Data
Fig. 6b), and although not required for MCC assembly, it is
essential for a spindle checkpoint arrest. Thus, similar to D1, a
role for KEN-2 in mediating MCC interactions with APC/C^Cdc20 would
explain its requirement in the SAC by stabilizing MCC interactions
with Cdc20^APC/C (ref. 20) and inhibiting degron recognition by
APC/C^Cdc20.

A middle segment of metazoan BubR1 includes an ABBA motif (A2)
and a D-box (D2) (Fig. 2a) that mediate Cdc20-BubR1 interactions in
a Mad2-independent manner. These motifs contribute weakly
to sustaining the SAC, and have a role in recruiting Cdc20 to
unattached kinetochores. The tubular EM densities located at the
ABBA motif and D-box recognition sites of Cdc20^MCC were assigned
to these motifs (Fig. 2c, top).

Figure 3 | Interactions of the MCC-Cdc20^APC/C module with the
APC/C, and APC/C catalytic inhibition by the MCC. a, Right, an
overview of the APC/C^MCC model with the corresponding cryo-EM
density. Left, segmented cryo-EM density of the Apc8A-Apc8B dimer and
its two associated Cdc20 molecules. Cdc20^APC/C interacts with Apc8B via
its N-terminal domain (NTD). Cdc20^MCC interacts with Apc8A through
its C-terminal Ile-Arg (IR) tail. b-d, Comparison of the binding mode
of BubR1 and Cdc20^MCC-WD40 in APC/C^MCC with the binding mode of
UbcH10 in APC/C^Cdh1.UbcH10-Ub (ref. 32). b, Segmented cryo-EM density
of Cdc20^MCC, BubR1^TPR, Apc4^HBD and Apc2^WHB. c, APC/C^Cdh1.UbcH10-Ub.
d, Both structures were superposed. BubR1^TPR and Cdc20^MCC-WD40
compete for the same binding surfaces on Apc2^WHB and Apc4^HBD that
form the UbcH10-binding site in APC/C^Cdh1.UbcH10-Ub (Extended Data
Fig. 5g). The Apc2^WHB sub-domain of Apc2 is shifted in the APC/C^MCC
complex relative to APC/C^Cdh1.UbcH10-Ub and would clash with the
UbcH10-binding site on Apc11^RING.


ARTICLE 
Cdc20^MCC contacts to the APC/C


Although Cdc20^MCC interactions with the APC/C are mainly mediated
through BubR1 and APC/C^Cdc20, two additional contacts are notable.
First, the C-terminal IR tail of Cdc20^MCC binds to a site on Apc8A that is
structurally equivalent to the Cdc20^APC/C C-box binding site on Apc8B
(Fig. 3a and Supplementary Video 1). This interaction can be
rationalized from the similarities in how the Apc3 IR tail and Apc8 C-box
binding sites interact with their cognate IR tail and C-box motifs,
respectively. Second, the lower surface of the Cdc20^MCC WD40 domain
interacts with the conserved acidic region of the Apc4 helix-bundle domain
(Apc4^HBD), close to the UbcH10-binding site (Figs 2c and 3a, b).

Figure 4 | Cryo-EM structures of APC/C^MCC-open, APC/C^ΔApc15-MCC
and APC/C^UbcH10-MCC and comparison with APC/C^MCC-closed. a, Overall
view of the cryo-EM density of APC/C^MCC-closed and fitted coordinates
for the MCC-Cdc20^APC/C module, Apc2^WHB, Apc11, Apc4 and Apc5.
The APC/C^MCC subunits are coloured as in Figs 1 and 3. b, Details of the
Apc15^NTH-binding site on Apc8A and Apc5. Apc8A and Apc5 are shown.
The position of the disordered Apc15^NTH is indicated by a box. c, d,
APC/C^ΔApc15-MCC complex (Apc15 deleted). e, f, APC/C^MCC-open complex.
g, h, APC/C^UbcH10-MCC complex. In APC/C^MCC-closed (a, b), BubR1^TPR
interacts with Apc2^WHB, and Apc15^NTH is disordered. In APC/C^ΔApc15-MCC
(c, d), the MCC-Cdc20^APC/C adopts the closed conformation, blocking the
catalytic module. Conversely in APC/C^MCC-open (e, f) and in
APC/C^UbcH10-MCC (g, h), BubR1^TPR no longer interacts with Apc2^WHB, and
Apc15^NTH is ordered. g, In APC/C^UbcH10-MCC, Apc2^WHB and Apc11^RING
interact with UbcH10. All cryo-EM reconstructions were filtered to 8.5 Å.


The MCC suppresses APC/C E3 ligase activity

UbcH10 interacts with the APC/C through the RING domain of Apc11
and the winged-helix B (WHB) domain of Apc2 (Apc2^WHB) (refs 32, 44). In
APC/C^MCC-closed, BubR1^TPR occludes both E2-binding sites (Figs 1
and 3). BubR1^TPR interacts directly with the UbcH10 interface of
Apc2^WHB that repositions to contact BubR1^TPR (refs 32, 44) (Fig. 3b-d
and Extended Data Figs 5g and 6c). Thus, in APC/C^MCC-closed, the steric
occlusion of UbcH10 binding would inhibit ubiquitin chain synthesis.


Apc15-dependent conformational flexibility of the MCC

The APC/C^MCC-closed structure explains how the MCC inhibits
APC/C^Cdc20 by obstructing substrate and UbcH10 interactions. However,
Cdc20 auto-ubiquitination is the trigger for the spontaneous reactivation
of the APC/C, dependent on the small APC/C subunit
Apc15 (refs 28, 29). In previously reported APC/C structures,
Apc15 adopts an extended conformation, anchored to Apc5 by its
N terminus, and bridging Apc5 and Apc8A through its adjacent
N-terminal helix (Apc15^NTH). Notably, in APC/C^MCC-closed, Apc15^NTH
is structurally disordered (Fig. 4a, b). This results from Cdc20^MCC
shifting the tip of Apc4^HBD that concomitantly repositions the adjacent
N-terminal domain of Apc5 (Apc5^NTD), disrupting its interaction with
Apc15^NTH (Fig. 4a, b, Extended Data Fig. 5d, e and Supplementary
Video 2). The disorder of Apc15^NTH and repositioning of Apc4^HBD
and Apc5^NTD in APC/C^MCC-closed contrasts with their conformations
in another structural state representing only 2% of APC/C^MCC particles
(termed APC/C^MCC-open) (Methods, Extended Data Fig. 4a, b and
Extended Data Table 2). In APC/C^MCC-open, EM density assigned to
the MCC-Cdc20^APC/C module is weaker relative to APC/C^MCC-closed.
However, EM density for Apc15^NTH is clearly defined (Fig. 4e, f).
Moreover, in APC/C^MCC-open, MCC-Cdc20^APC/C is shifted towards
the TPR lobe of APC/C, disrupting contacts between the MCC and
the catalytic module, Apc4^HBD and Apc10. The 'open' position of
MCC-Cdc20^APC/C is stabilized by Apc15^NTH interacting with Apc5^NTD
that pushes onto the Cdc20^MCC-binding site of Apc4^HBD (Fig. 4f and
Extended Data Fig. 5e). Thus, the stability of Apc15^NTH influences
the transition between open and closed conformations of the
MCC-Cdc20^APC/C module.

To investigate further how Apc15 modulates the position of
MCC-Cdc20^APC/C, we determined the structure of APC/C^MCC with Apc15
deleted (APC/C^ΔApc15-MCC) (Extended Data Table 2, Fig. 4c, d and
Extended Data Fig. 3). APC/C^ΔApc15 assembled similarly to wild-type
APC/C and was catalytically active towards securin (Extended Data
Fig. 1k, l). However, in contrast to APC/C^MCC, the reconstituted
APC/C^ΔApc15-MCC was defective for Cdc20 auto-ubiquitination,
consistent with previous reports (Fig. 5c and Extended Data Fig. 1h).
3D classification of the APC/C^ΔApc15-MCC cryo-EM data set showed that
the MCC-Cdc20^APC/C module adopts only the closed configuration (Fig.
4c, d, Extended Data Fig. 7b and Supplementary Video 2), with no
evidence of APC/C^MCC-open (Extended Data Fig. 7a, classes 1 to 6).
Thus, the absence of Apc15 promotes APC/C^MCC-closed, explaining why
Apc15 is necessary for Cdc20 ubiquitination. In support of our
structural data that Apc15^NTH is required to stabilize APC/C^MCC-open,
similar to APC/C^ΔApc15-MCC, APC/C^MCC reconstituted with Apc15 lacking
the Apc15^NTH (Apc15^ΔNTH) is defective in Cdc20 auto-ubiquitination
(Fig. 5e, lanes 8 and 9, and Extended Data Fig. 1h).

Figure 5 | Mechanism of Cdc20 auto-ubiquitination by
APC/C^UbcH10-MCC. a, Model of a Cdc20^MCC ubiquitination complex based
on the APC/C^UbcH10-MCC cryo-EM reconstruction. The UbcH10-ubiquitin
conjugate is modelled in the closed conformation. b, Top, cryo-EM
density and model of APC/C^UbcH10-MCC. The EM map is filtered to 12 Å
and displayed at slightly lower threshold than in Fig. 4g; see Extended
Data Fig. 3e for comparisons. Clear EM density connects Cdc20^MCC with
UbcH10. Bottom, the Cdc20^MCC pre-ubiquitination model. Cdc20^MCC
residues Lys485 and Lys490, ubiquitinated in logarithmic and
checkpoint-arrested cells, respectively, are in close proximity to the
UbcH10 catalytic site (red sphere). c, Apc15 is required for Cdc20
ubiquitination by recombinant APC/C^MCC. d, BubR1^TPR mutations at the
Apc2^WHB interface (R169A, F175A, V200A and L205) (Extended Data
Fig. 5g) stimulate Cdc20 ubiquitination. e, Cdc20 residues Lys485 and
Lys490 are ubiquitinated by recombinant APC/C^MCC (compare lanes 2, 3
and 4, 5). Apc15^NTH is required for Cdc20 ubiquitination by recombinant
APC/C^MCC (lanes 8, 9). f, Cartoon illustrating reciprocal regulation of
APC/C and MCC by [...] Nek2A can bypass the SAC. In APC/C^MCC-open, [...]
Experiments in c-e were replicated three times. See Supplementary Fig. 1
for gel source data.

Since interactions between Apc2^WHB and BubR1 stabilize
APC/C^MCC-closed (Figs 1 and 3b), disrupting this interface should favour
APC/C^MCC-open. Consistent with this idea, negative-stain EM
reconstructions of an APC/C^MCC mutant with Apc2^WHB deleted showed
APC/C^MCC adopting the open conformation, with no APC/C^MCC-closed
(Extended Data Figs 1c and 2b). Importantly, in the complementary
Apc2^WHB-binding surface mutant of BubR1 (Extended Data Fig. 5g),
Cdc20^MCC auto-ubiquitination is stimulated (BubR1 mutant; Fig. 5d and
Extended Data Fig. 1h).


Mechanism of Cdc20 auto-ubiquitination 

Cdc20^MCC auto-ubiquitination catalysed by APC/C^MCC-UbcH10 is an
intra-molecular process reliant on the APC/C^MCC-open conformation. To
explore this possibility further, we determined the cryo-EM structure
of APC/C^MCC in complex with UbcH10 (APC/C^UbcH10-MCC) (Extended
Data Table 2). A 3D reconstruction using 7% of total particles showed
clear EM density for both UbcH10 and the MCC-Cdc20^APC/C module
(Extended Data Fig. 7c). The resultant EM map at 8.9 Å resolution
allowed rigid body docking of the APC/C, UbcH10 and
MCC-Cdc20^APC/C coordinates. APC/C^UbcH10-MCC resembles APC/C^MCC-open
(Figs 4e-h, 5a, Extended Data Fig. 3 and Supplementary Video 2).
Apc15^NTH is ordered and MCC-Cdc20^APC/C is rotated towards the
TPR lobe leaving the catalytic module accessible to bind UbcH10
(Fig. 5a).

Notably, in APC/C^UbcH10-MCC, UbcH10 induces an additional
small rotation of MCC-Cdc20^APC/C. The C terminus of Cdc20^MCC is
visualized as an extended tubular density feature linking the Cdc20^MCC
WD40 domain with the catalytic site of UbcH10 (Fig. 5b, Extended
Data Fig. 3e and Supplementary Video 2). Modelling the C terminus of
Cdc20^MCC into this density shows that Lys485 and Lys490 are accessible
to the UbcH10 catalytic site (Fig. 5b). Both residues are ubiquitinated
in vitro (data not shown), and their replacement by Arg virtually
eliminated Cdc20 auto-ubiquitination (Fig. 5e), indicating that these
two residues are the major sites of Cdc20^MCC ubiquitination in the
context of APC/C^MCC-UbcH10. Our model, that the MCC competitively
restricts access of UbcH10 to its binding site on the APC/C, is consistent
with the reduced binding of UbcH10 to endogenous APC/C^MCC relative
to APC/C^Cdc20 (ref. 31) and with our finding that, compared with securin
ubiquitination, Cdc20 auto-ubiquitination required a tenfold higher
concentration of UbcH10 (Extended Data Fig. 1m), which agrees
with previous findings. Although our data show that MCC suppresses
APC/C-UbcH10 interactions, the effect of the MCC on modulating the
affinity of the APC/C for the elongating E2 Ube2S, which contributes
to checkpoint silencing, is less clear. The binding site for the
Ube2S C-terminal LR tail at the Apc2-Apc4 interface, necessary for
its association with the APC/C, is unaffected by the MCC (Fig. 5a),
thus in principle allowing Ube2S to assemble ubiquitin chains onto the
C terminus of Cdc20^MCC. However, the capacity of Cdc20 to stabilize
APC/C-Ube2S complexes may be affected by the interactions of
BubR1 with the APC/C^MCC.

In conclusion, our structural analysis of APC/C^MCC and
APC/C^UbcH10-MCC provides a molecular explanation for the spontaneous
APC/C reactivation resulting from SAC inactivation involving
ubiquitin-mediated Cdc20^MCC proteolysis. APC/C reactivation in
response to the cessation of SAC signalling is facilitated by the reciprocal
control of APC/C and MCC activities mediated by the conformational
flexibility of APC/C^MCC, influenced by Apc15. Modulation of this
conformational change could allow for the regulation of Cdc20^MCC
auto-ubiquitination (Fig. 5f). A candidate for such a role is p31^comet,
a Mad2-binding protein proposed to stimulate Cdc20 ubiquitination
in checkpoint-arrested cells.


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper.

METHODS


No statistical methods were used to predetermine sample size. The experiments 
were not randomized and the investigators were not blinded to allocation during 
experiments and outcome assessment. 

Cloning and expression of human APC/C complex genes. The DNA coding
sequences (CDSs) of the human APC/C subunits (wild-type, mutant Apc2^ΔWHB,
Apc11-UbcH10 fusion and ΔApc15) were assembled by USER cloning into a
modified version of the insect cell-baculovirus MultiBac expression system.
All APC/C subunit CDSs were distributed in two recombinant vectors that were
used for recombinant baculovirus generation. For APC/C expression, Hi-5 cells
at a density of 2 × 10⁶ cells ml⁻¹ were co-infected with two pre-cultures of Sf9
cells each pre-infected with one of the two recombinant APC/C baculoviruses.
APC/C expression (unphosphorylated) was performed for 30 h. To obtain
APC/C^OA (phosphorylated APC/C), okadaic acid at a final concentration of
0.1 µM was added after 24 h of infection. Cells were collected after 5 h of
treatment.
Cloning and expression of human MCC complex genes and Cdc20. The CDSs
of the human MCC subunits (Mad2, Cdc20, BubR1 and Bub3) used for structural
analysis were cloned into a pU2 plasmid using the same method as for the
APC/C. BubR1 was fused in frame with an N-terminal 3 × Flag tag. Cdc20 for
individual expression was cloned into a pFastbac1 HTA in frame with the
His6-tag. In addition, a maltose-binding protein (MBP) tag, followed by a
TEV site between the starting codon of Cdc20 and the N-terminal His6 tag,
was added by restriction free cloning method (RF-cloning). To obtain a vector
containing Mad2, Cdc20 and BubR1 (residues 1-569) CDSs (miniMCC construct),
a Mad2- and Cdc20-containing expression cassette from a pU1 vector was
shuttled (by the AvrII and PmeI sites) into a pFastbacDual vector (BstZ17I
and SpeI sites) that contained 3 × Flag-BubR1^1-569 under the control of the
p10 promoter. A C-terminal StrepIIx2 tag was added by RF-cloning into the
BubR1 constructs used in ubiquitination assays. Expression of either the MCC
or Cdc20 constructs was performed similarly to the APC/C (unphosphorylated)
to avoid CDK-dependent inhibition of APC/C-Cdc20 interactions. Moreover,
cells were collected 48 h after infection. To express MCC complexes with the
tagged versions of BubR1, virus containing the BubR1-StrepII constructs was
co-infected with MCC virus. To express the MCC complex with the
Cdc20^K485R/K490R mutations, viruses containing the individual MCC subunits
were used for co-infection. Apc15^ΔNTH, a mutant form of Apc15 with a
(Gly-Ser-Ala)-repeat linker substitution of the N-terminal helix (NTH:
residues 23-57), was cloned into an Escherichia coli pOPIN expression vector
and purified using a C-terminal StrepIIx2 tag.

Reconstitution and purification of APC/C^MCC complexes. To generate mitotic
phosphorylated APC/C (APC/C^OA) we incubated APC/C-expressing insect cells
with the phosphatase inhibitor okadaic acid (OA) (as described above). The extent
of APC/C phosphorylation was monitored by assessing the migration of the Apc3
subunit on SDS-PAGE (Extended Data Fig. 1a, f). The recombinant APC/C^OA
was phosphorylated on ~110 sites (Extended Data Table 3), correlating closely with
those previously identified in endogenous APC/C isolated from HeLa cells arrested
by the mitotic checkpoint, and with sites phosphorylated in vitro by the mitotic
APC/C activating kinases Cdk2-cyclinA2-Cks2 and Plk1 (ref. 22) (Extended Data
Table 3). Compared with APC/C from untreated insect cells, and using Cdc20 as
the coactivator, APC/C^OA readily ubiquitinates securin (Extended Data Fig. 1g, h).

The APC/C^MCC complex was reconstituted by co-lysing APC/C^OA-expressing
cells with insect cells expressing separately MBP-tagged Cdc20 and the MCC
(BubR1, Bub3, Mad2 and untagged Cdc20). Hi-5 cell pellets expressing either
APC/C^OA or MBP-Cdc20 or MCC were mixed together in reconstitution
buffer containing 50 mM HEPES (pH 8.2), 150 mM NaCl, 5% glycerol, 0.5 mM
TCEP, 1 mM EDTA, 0.1 mM PMSF, 2 mM benzamidine, 5 U ml⁻¹ benzonase
(Novagen), Complete EDTA-free protease inhibitors (Roche), 50 mM NaF, 20 mM
β-glycerophosphate and 0.1 µM okadaic acid. After complete mixing the cells were
co-lysed by sonication and the lysate was centrifuged for 60 min at 17,000g. The
soluble fraction was loaded onto a Strep-Tactin Superflow Cartridge (Qiagen)
for purification using the StrepIIx2 tag on Apc4 as described previously. The
eluate was then applied to an anti-Flag M2 Affinity Gel (A220, Sigma) column
(directed against the N-terminal Flag tag on BubR1) and incubated overnight.
The APC/C^MCC complex was eluted with a 3 × Flag peptide at a concentration of
50 µg ml⁻¹. The resulting elution was concentrated to around 1.4 mg ml⁻¹ and
run on a Superose 6 3.2/300 (GE Healthcare Life Sciences) gel-filtration column
pre-equilibrated with gel-filtration buffer containing 20 mM HEPES (pH 8.0),
150 mM NaCl and 0.5 mM TCEP. The gel filtration was run on an ÄKTAmicro (GE
Healthcare Life Sciences) with a flow rate of 50 µl min⁻¹.

An SDS-PAGE of purified APC/C^MCC showed both versions of Cdc20,
consistent with the incorporation of two distinct subunits of Cdc20 into APC/C^MCC
(refs 2, 20) (Extended Data Fig. 1j). Reconstituted APC/C^MCC is stable and
homogeneous as shown by size-exclusion chromatography (Extended Data Fig. 2a).

The APC/C^Apc15ΔNTH complex was reconstituted by incubating recombinant
APC/C^ΔApc15 with Apc15^ΔNTH, at concentrations of 200 nM and 1 µM, respectively,
followed by size exclusion chromatography. Anti-Apc15 antibodies were from
Santa Cruz Biotechnology (sc-398448).
APC/C ubiquitination assay. To examine APC/C activity towards securin, the
ubiquitination assay was performed with 60 nM of recombinant human APC/C,
150 nM UBA1, 300 nM UbcH10, 300 nM Ube2S, 20 µM ubiquitin, 2 µM securin,
5 mM ATP, 0.25 mg ml⁻¹ BSA and 7 nM of recombinant human Cdc20. The
ubiquitination products of securin were detected by western blot with either
an anti-His antibody (631212; Clontech) or an anti-securin antibody (700791;
Invitrogen).

To test the activity of a pre-assembled APC/C^MCC complex towards Cdc20^MCC
(Fig. 5c), ubiquitination reactions were performed with 250 nM of recombinant
human APC/C^OA-MCC and 10 µM of UbcH10 (40× excess). To test the activity
of APC/C towards the Cdc20^MCC from individually purified wild-type and mutant
MCC^BubR1-StrepII (purification by StrepIIx2 affinity and gel-filtration columns),
ubiquitination reactions were performed with 200 nM of recombinant human
APC/C^OA, 200 nM of recombinant human Cdc20 and either 300 or 600 nM of
recombinant human MCC^BubR1-StrepII (Fig. 5d, e). Either with a pre-assembled
APC/C^MCC complex or with a molar excess of MCC complex over free Cdc20 and
APC/C, only Cdc20^MCC ubiquitination is promoted (data not shown). Cdc20 and the
ubiquitination products of Cdc20^MCC were detected by western blot with an
anti-Cdc20 antibody (Cdc20 H-175 sc-8358; Santa Cruz Biotechnology).
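
The 40-fold UbcH10 excess quoted for the pre-assembled complex follows directly from the two concentrations given (10 µM over 250 nM). As a minimal sketch, assuming all concentrations are first converted to nM, the ratios used in these assays can be checked as below; the helper name `molar_ratio` is hypothetical and not part of the original protocol.

```python
def molar_ratio(numerator_nm: float, denominator_nm: float) -> float:
    """Return the molar ratio of two concentrations, both given in nM."""
    if denominator_nm <= 0:
        raise ValueError("denominator concentration must be positive")
    return numerator_nm / denominator_nm


# Concentrations from the assay description, converted to nM.
UBCH10_NM = 10_000.0     # 10 uM UbcH10
APC_C_MCC_NM = 250.0     # 250 nM pre-assembled complex
FREE_CDC20_NM = 200.0    # 200 nM recombinant Cdc20
MCC_DOSES_NM = (300.0, 600.0)  # the two MCC concentrations tested

# Excess of UbcH10 over the pre-assembled complex.
ubch10_excess = molar_ratio(UBCH10_NM, APC_C_MCC_NM)

# Molar excess of separately purified MCC over free Cdc20.
mcc_over_cdc20 = [molar_ratio(mcc, FREE_CDC20_NM) for mcc in MCC_DOSES_NM]

print(f"UbcH10 excess over APC/C-MCC: {ubch10_excess:.0f}x")
print(f"MCC : Cdc20 molar ratios: {mcc_over_cdc20}")
```

Both MCC doses exceed the free Cdc20 concentration, which matches the condition described in the text (a molar excess of MCC over free Cdc20 and APC/C).
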
Electron microscopy. Freshly purified APC/C^MCC samples were analysed by
negative-stain EM to check the sample quality and to obtain a low-resolution
reconstruction. Micrographs were collected on a 2k × 2k CCD camera fitted to
a FEI Spirit electron microscope at an accelerating voltage of 120 kV, operated at
a nominal magnification of 42,000 with a resulting pixel size of 2.46 Å per pixel
at specimen level. Defocuses were set at approximately −2 µm. Particles were
automatically selected using the autoboxer program implemented in EMAN.
About 150 micrographs per sample were collected yielding ~10,000 particles. After
3D classification performed with RELION only the prominent best class (30-40%
of total amount of particles) was used for auto-refinement and final low-resolution
structure determination.

Grid preparation for both negative-stain EM and cryo-EM was performed as
described previously. Cryo-EM micrographs were collected with an FEI Tecnai
Polara electron microscope at an acceleration voltage of 300 kV and Falcon III
direct detector. Micrographs were taken using EPU software (FEI) at a nominal
magnification of 78,000, yielding a pixel size of 1.36 Å per pixel at specimen
level. A total exposure time of 1.6 s was used at a dose rate of 27 electrons per
pixel. Defocus range was set at −2.0 to −4.0 µm. Movie frames were recorded as
described.
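
For orientation, the acquisition parameters above can be converted into a Nyquist limit and an approximate accumulated dose. This is a hedged sketch only: the text gives the dose rate as 27 electrons per pixel without a time unit, so the code assumes it means electrons per pixel per second. The function names are illustrative and do not come from any acquisition software.

```python
def nyquist_limit_angstrom(pixel_size_a: float) -> float:
    """Finest spacing (in angstroms) that a given pixel size can sample: two pixels."""
    return 2.0 * pixel_size_a


def total_dose_per_a2(rate_e_per_px_s: float, exposure_s: float,
                      pixel_size_a: float) -> float:
    """Accumulated dose in electrons per square angstrom.

    ASSUMPTION: the dose rate is expressed per pixel per second.
    """
    pixel_area_a2 = pixel_size_a ** 2
    return rate_e_per_px_s * exposure_s / pixel_area_a2


PIXEL_SIZE_A = 1.36   # angstrom per pixel at specimen level (from the text)
EXPOSURE_S = 1.6      # total exposure time in seconds (from the text)
DOSE_RATE = 27.0      # electrons per pixel; per-second unit is an assumption

nyquist = nyquist_limit_angstrom(PIXEL_SIZE_A)
dose = total_dose_per_a2(DOSE_RATE, EXPOSURE_S, PIXEL_SIZE_A)

print(f"Nyquist limit: {nyquist:.2f} A")
print(f"Accumulated dose (assumed units): {dose:.1f} e/A^2")
```

Under that assumption the accumulated dose is roughly 23 electrons per square angstrom, and the Nyquist limit of 2.72 Å is finer than any resolution reported for these reconstructions.
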
Image processing. Image processing was performed with RELION 1.4 (ref. 60).
The initial steps including motion correction, CTF estimation, particle picking and
particle sorting by Z-score and 2D classification were performed as described.
Selected particles were used for a first round of 3D classification with global search
and a sampling angular interval of 7.5°, using a 60 Å low-pass filtered APC/C^Cdh1.Emi1
EM map as a reference. Poorly characterized 3D classes, with poorly recognizable
features, were discarded at this stage and the remaining particles were refined and
corrected for beam-induced particle motion using particle polishing in RELION.
Polished particles were used for another round of 3D classification with a local
search within 15° and a smaller angular sampling interval of 3.7° (Extended Data
Figs 4 and 7). The reconstruction generated from all the polished particles,
low-pass filtered at 40 Å, was used as reference.

To isolate particles for the APC/C^MCC-closed state, classes showing closed-like
features for the MCC-Cdc20^APC/C module (for example, proximity to Apc2, Apc4
and Apc10; Extended Data Fig. 4, classes 1-3) were combined and refined. The
resultant map was used as reference for a subsequent 3D classification performed
with a soft edge mask on the MCC-Cdc20^APC/C module (Extended Data Fig. 4).
The mask was created from a map converted from the fitted coordinates of the
MCC-Cdc20 module, with three pixel extension and five pixels soft edge width.
The MCC-Cdc20 module coordinates were created by fitting the MCC core
coordinates and isolated Cdc20 (PDB code 4AEZ) on the best MCC-Cdc20^APC/C
module density map (Extended Data Fig. 4, class 1).

To isolate particles for the APC/C^MCC-open state, classes showing open-like
features for the MCC-Cdc20 module (for example, proximity to TPR lobe and
loss of contact with Apc2, Apc4 and Apc10; Extended Data Fig. 4, classes 4-5)
were refined together. The obtained averaged class was used as a reference for a
subsequent 3D classification performed with a larger mask (6 pixel extension and
6 pixel soft edge) created with the MCC-Cdc20^APC/C module coordinates fitted
into the corresponding density in the APC/C^UbcH10-MCC reconstruction described
below (Extended Data Figs 4 and 7).

To obtain the APC/C^ΔApc15-MCC structure, the best classes from the 3D
classification with local searches step were refined together (Extended Data
Fig. 7a, classes 1-3).

To isolate the particles for the APC/C^UbcH10-MCC reconstruction, instead of
performing the 3D classification with local search steps, an initial classification
with a large mask (similar to APC/C^MCC-open) was performed. The latter allowed
the identification of a class that features both the MCC-Cdc20 module and the
UbcH10-Apc11-Apc2^WHB-Apc2^α/β-domain assembly. A large mask including the
latter regions was created by fitting the MCC-Cdc20^APC/C module coordinates
and the UbcH10-Apc11-Apc2^WHB-Apc2^α/β-domain assembly (PDB code 5A31)
in the preliminary APC/C^UbcH10-MCC reconstruction. The latter mask was used
for a re-classification of the initial particles and allowed the isolation of the final
APC/C^UbcH10-MCC particles (Extended Data Fig. 7c).

All resolution estimates were based on the gold standard Fourier shell
correlation (FSC) = 0.143 criterion. Final FSC curves were calculated using a
soft mask (five pixel extension and three pixel soft edge) of the two independent
reconstructions. To visualize high-resolution details, all density maps were
corrected for the modulation transfer function of the detector and sharpened by
applying negative B-factors, estimated using automated procedures.
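
To illustrate how the FSC = 0.143 criterion yields a resolution value, the sketch below finds the first point at which an FSC curve falls below the threshold and interpolates linearly between the neighbouring shells. It is an illustration under stated assumptions rather than the RELION procedure: the function name is hypothetical, and the example curve is synthetic, not data from this study.

```python
from typing import Optional, Sequence, Tuple


def resolution_at_threshold(curve: Sequence[Tuple[float, float]],
                            threshold: float = 0.143) -> Optional[float]:
    """Return the resolution (angstroms) at which an FSC curve first drops
    below `threshold`.

    `curve` holds (spatial frequency in 1/angstrom, FSC) pairs sorted by
    increasing frequency. The crossing is located by linear interpolation
    between adjacent shells. Returns None if the curve never crosses.
    """
    for (f0, c0), (f1, c1) in zip(curve, curve[1:]):
        if c0 >= threshold > c1:
            fraction = (c0 - threshold) / (c0 - c1)
            f_cross = f0 + fraction * (f1 - f0)
            if f_cross <= 0:
                return None  # no meaningful resolution at zero frequency
            return 1.0 / f_cross
    return None


# Synthetic FSC curve used purely for illustration (NOT data from the paper).
example_curve = [
    (0.05, 1.000),
    (0.10, 0.900),
    (0.15, 0.500),
    (0.20, 0.200),
    (0.25, 0.086),
    (0.30, 0.000),
]

example_resolution = resolution_at_threshold(example_curve)
print(f"Resolution at FSC = 0.143: {example_resolution:.2f} A")
```

In this synthetic case the curve crosses 0.143 midway between the 0.20 and 0.25 Å⁻¹ shells, giving a resolution of about 4.4 Å.
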

Local resolution maps for all the cryo-EM reconstructions were calculated with
RESMAP using a resolution range between 3.5 and 15 Å and displayed with
Chimera. For comparing structural features among the cryo-EM reconstructions,
shown in Fig. 4 and Extended Data Fig. 3, which have different overall resolutions, a
common filter of 8.5 Å was applied. This was selected based on the local resolution
of the APC/C^UbcH10-MCC map in the region assigned to Apc15 (the main region
of relative comparison). APC/C^UbcH10-MCC is the APC/C reconstruction with the
lowest overall resolution. Filtering all the reconstructions to 8.5 Å resolution allowed
a clear definition of the structural details of Apc15 and other regions without the
appearance of noise. To visualize the connecting density between UbcH10 and
Cdc20, the APC/C^UbcH10-MCC map was filtered to 12 Å resolution based on the
local resolution of this area and the threshold was slightly lowered.
Model building. Initial fitting and superposition of coordinates was performed
with Chimera. Model building of APC/C^MCC was performed in COOT. APC/C
platform, TPR lobe, Apc10 and accessory subunit coordinates from the atomic
structure of APC/C^Cdh1.Emi1 (PDB code 4UI9) were individually rigid body fit into
the APC/C^MCC-closed cryo-EM density. A few regions such as Apc4^HBD, Apc5^NTD
and Apc11 were also modified by flexible fitting. The Apc2^WHB domain (PDB
code 4YII) was rigid body fit into the corresponding density. Cdc20^APC/C IR tail
and NTD were rigid body fit from the coordinates of the APC/C^Cdc20.Hsl1 cryo-EM
structure. The Cdc20^MCC IR tail was modelled by superposing the TPR domain
of Apc3 including Cdc20^IR from APC/C^Cdc20.Hsl1 to the TPR domain of APC/C^MCC
Apc8A. Two copies of the human Cdc20^WD40 domain (PDB code 4GGA), human
C-Mad2 (PDB ID: 2V64) and the human BubR1^TPR domain (PDB code 3SI5)
were rigid body fit on the MCC-Cdc20 module density. The Cdc20^MCC CRY box,
included in the human Cdc20^WD40 domain crystal structure (PDB code 4GGA),
was modelled by flexible fitting. In addition, the Cdc20 KILR motif was modelled
by rigid body fit of the MCC core crystal structure (PDB code 4AEZ) into the
corresponding density. A similar procedure was applied to model the first KEN1
and helix-loop-helix region of BubR1. BubR1 D1 and D2 were modelled by rigid
body fit of Acm1 D-box 3 (PDB code 3BH6). Similarly BubR1 A1 and K2 were
modelled by flexible fitting of the Acm1 region spanning the A-motif and KEN
box as explained in the main text. BubR1 A2 was modelled as a rigid body fit of the
Acm1 A-motif. Loop extensions were modelled as idealized polyalanine.

Model refinement was performed with REFMAC 5.8 (ref. 68). A REFMAC
weight of 0.04 was defined by cross-validation using half reconstructions.
A resolution limit of 4.0 Å was used. All available crystal structures or NMR
structures were used for secondary structure restraints. The refinement statistics
are summarized in Extended Data Table 2b.

Map visualization. Figures were generated using PyMOL and Chimera. Structural
conservation figures were generated using ConSurf.

Mass spectroscopy. Purified proteins were prepared for mass spectrometric
analysis by in solution enzymatic digestion, without prior reduction and alkylation.
Protein samples were digested with trypsin or elastase (Promega), both at an
enzyme to protein ratio of 1:20. The resulting peptides were analysed by nano-scale
capillary LC-MS/MS using an Ultimate U3000 HPLC (ThermoScientific Dionex)
to deliver a flow of approximately 300 nl min⁻¹. A C18 Acclaim PepMap100 5 µm,
100 µm × 20 mm nanoViper (ThermoScientific Dionex), trapped the peptides
before separation on a C18 Acclaim PepMap100 3 µm, 75 µm × 250 mm nanoViper
(ThermoScientific Dionex). Peptides were eluted with a 90-min gradient of
acetonitrile (2% to 50%). The analytical column outlet was directly interfaced via
a nano-flow electrospray ionization source, with a hybrid quadrupole orbitrap mass
spectrometer (Q-Exactive Plus Orbitrap, ThermoScientific). LC-MS/MS data were
then searched against an in house LMB database using the Mascot search engine
(Matrix Science), and the peptide identifications validated using the Scaffold
program (Proteome Software Inc.). All data were additionally interrogated
manually.

Sequence alignment. Sequence alignment was performed using Jalview”*. 
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Extended Data Figure 1 | See next page for caption. 
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Extended Data Figure 1 | Biochemical characterization of recombinant 
APC/C^MCC complex and preparations of wild-type and mutant 
complexes. a–e, SDS-PAGE gels (stained with Coomassie) of 
gel-filtration peak fraction from wild-type and mutant APC/C^MCC 
complex preparations used in this study. Western blot against Strep tag 

in e confirms the presence of Apc154N'™ SP construct. f, Top, western 
blot performed with anti-Apc3 antibody to monitor the time-dependent 
phosphorylation of this subunit induced by okadaic acid (OA) treatment 
in the APC/C-expressing insect cells. Bottom, western blot against Apc6 
was used as a loading control and reflects the decrease in cell viability 
after addition of OA (data not shown). g, Western blot against His-tagged 
ubiquitin of in vitro securin ubiquitination assays performed with either 
APC/C or APC/C^OA in the presence or absence of Cdc20. h, The input 
sample for the ubiquitination assays performed in this study is shown. 
i, Western blot against securin of in vitro securin ubiquitination assays 
performed with APC/C^OA and Cdc20 with or without either MCC or 
miniMCC. j, SDS-PAGE of APC/C^MCC reconstituted with 
MBP-TEV-Cdc20^APC/C and untagged Cdc20^MCC. MBP-TEV-Cdc20^APC/C TEV 
cleavage products are indicated. k, SDS-PAGE gels of reconstituted 
APC/C^ΔApc15-MCC complex. l, Western blot against His-tagged ubiquitin 
of in vitro securin ubiquitination assays performed with APC/C^ΔApc15 
and Cdc20 with or without MCC. m, Western blot against His-tagged 
ubiquitin of in vitro Cdc20 ubiquitination assays performed with 
APC/C^MCC and increasing concentrations of UbcH10. Experiments in 
g, l and m were replicated three times, and those in i were replicated four 
times. See Supplementary Fig. 1 for gel source data. 
Extended Data Figure 2 | Stability of APC/C^MCC complex, negative-stain 
EM reconstructions of APC/C^MCC wild-type and mutant 
complexes and cryo-EM analysis. a, Top, chromatogram showing the 
elution profile of the APC/C^MCC complex run on a Superose 6 column. 
Apo APC/C and thyroglobulin standard molecular mass marker (669 
kDa) are indicated. Bottom, SDS-PAGE of the eluted fractions. 
APC/CM€ elutes earlier than APC/C™. b, Negative-stain EM 
reconstructions performed for this study and EMD-1591 (ref. 31) 
are shown. APC/C (grey) and MCC-Cdc20^APC/C module (red) are 
highlighted. The APC/CMC and APC/CAPP4WHB-MCC reconstructions are 
also shown in the same orientation as in Fig. 4 to facilitate comparisons. 

c, A typical cryo-EM micrograph of APC/C^MCC-closed representative of 
20,234 micrographs. d, Gallery of two-dimensional class averages of 
APC/C^MCC-closed showing different views representative of 50 
two-dimensional classes. e, Density quality for secondary structures. The 
APC/C^MCC-closed map was filtered to 4.0 Å. 
Extended Data Figure 3 | Resolution and other cryo-EM features of 
APC/C^MCC complexes. a–d, Fourier shell correlation (FSC) curves (a), 
and local resolution maps calculated with RESMAP (b–d) are shown 
for all the cryo-EM reconstructions determined in this study. b, Ribbon 
representations of structures shown. c, Overall views of local resolution 
maps. d, Close up of platform region (Apc4, Apc5 and Apc15). All the 
maps shown in c and d are filtered to 8.5 Å. Local resolution colour 
scheme is indicated in the bar at the bottom of d. e, The APC/C^UbcH10-MCC 
reconstruction filtered at 12 Å and shown at different threshold levels. 
The lowest threshold is the same as in Fig. 5b. 
Extended Data Figure 4 | Three-dimensional classification of APC/C^MCC. 
a, b, 3D class averages obtained by classification with local searches 
(see Methods) are shown in a. Classes 1–5 (56%) show density for the 
MCC-Cdc20^APC/C module with classes 1–3 and 4–5 having a closed-like 
and an open-like conformation, respectively (framed in red, see Methods). 
Particles from classes 1–3 and 4–5 were separately refined and re-classified 
using a mask (yellow, see Methods) shown in b to isolate the best quality 
particles for APC/C^MCC-closed (mask 1: left) and APC/C^MCC-open (mask 2: 
right). Percentages for each of the classes relative to the total number 
of selected particles are indicated. The percentages relative to the total 
number of APC/C^MCC particles are shown in parentheses. 
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Extended Data Figure 5 | Features of the APC/CM structure. 
on Apc4#®D. f, Close up view of the Cdc20M°° CRY box recognition site 


a-c, Cryo-EM density and fitted coordinates for the Cdc204°/" IR 

tail (Cdc204PC/"®; a), the Cdc204P°/ NTD (Cdc204?C/"-N™P); b) and of Cdc204?°/©, The CRY box also contacts BubR1 in proximity to D1. 

C-Mad2 (c) are shown. Colours for each subunit are as in Fig. 1. Colours for each subunit are as in Fig. 1. g, Superposition of the Apc2“/#8 
domains from APC/CM€ and APC/CC4hE Hs!l-UbcH10-Ub structures and the 


d, Overall superposition of the APC/C°*°4*"! structure (red) with the 

APC/CMCC-cosed structure (green). The Cdc20V>” change of position is corresponding interacting regions of BubR17?® and UbcH10 are shown. 
illustrated, and the blades forming the D-box (yellow) binding-pocket are Bottom left, the residues mutated in BubR1™ that contact Apc2™"8 and 
highlighted. e, Superposition of Apc4, Apc5 and Apc15 between used in the ubiquitination assay shown in Fig. 5d are indicated (red). 
APC/CCAb-Hsll-UbcH10-Ub structure (grey) and APC/CMC (subunit colours as Bottom right, residues of UbcH10 (red) that contact the corresponding site 
in Fig. 1) shows the marked conformational change of Apc4!®P, Apc5N™P on Apc2"® ablate APC/C UbcH10-dependent ubiquitination activity’. 
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Extended Data Figure 6 | Conservation analysis on BubR1“!*? and 
BubR1??¥ regions. a, Similarities in modes of binding of BubR1 to two 
Cdc20 subunits of APC/CM (left) and Acm1 to two Cdh1 subunits in the 
Acm1-Cdh1 heterotrimer*® (right). D-box, KEN box, NEN box and ABBA 
motif are labelled as D, K, NEN and A, respectively. BubR1 (colour-ramped 
from blue to red indicating N to C terminus) mediates a Cdc20 dimer 
interface, whereas Acm1 mediates a Cdh1 dimer interface. b, Local sequence 
alignment performed with BubR14!© region sequences from several 
species (described on the left as: sequence identifier_protein name_species/ 
residue number) and the Saccharomyces cerevisiae Acm1***N region. 




A D-box-like feature (corresponding to Emil?>* 7-10 positions)*? 
precedes the first ABBA motif (A1). A 21-33-residue long linker connects 
the A1 to the second KEN-box (K2). Conserved positions are highlighted in 
orange. c, ConSurf analysis of the BubR1'?® region highlighting conserved 
residues on the Cdc204?'" binding pocket (left) and on the Apc2"8 
domain pocket (right). The Cdc20“?“'" binding pocket is required for a 
functional SAC*”. This site interacts with residues of BubR1 immediately 
N-terminal to KEN-2, thereby reinforcing their contacts with Cdc20°PC/”, 
Residue conservation is indicated in a gradient from cyan to purple. BubR1, 
Cdc20*P'© and Apc2"® are coloured as in Fig. 1. 
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by classification with local searches (see Methods) are shown for 
APC/C^ΔApc15-MCC map was filtered to 4.8 Å. c, 3D class averages obtained 
by classification using a mask (yellow, see Methods) are shown for 
APC/C^UbcH10-MCC. Class 1 was used for the final reconstruction. 
Percentages relative to the total amount of particles are indicated for each 
of the classes. 
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Extended Data Table 1 | Glossary of terms and abbreviations used in this study 


Term | Definition
APC/C | Anaphase-promoting complex/cyclosome (subunits: Apc1 to Apc8, Apc10 to Apc13, Apc15, Apc16)
MCC | Mitotic checkpoint complex (subunits: Cdc20^MCC, Mad2, BubR1, Bub3)
SAC | Spindle assembly checkpoint
Cdc20 | Mitotic APC/C coactivator subunit
Cdh1 | Late mitotic APC/C coactivator subunit
UbcH10 | Initiating E2
Ube2S | Elongating E2
APC/C^Cdc20 | APC/C–Cdc20 complex
APC/C^Cdh1 | APC/C–Cdh1 complex
APC/C^MCC | Inhibited APC/C^Cdc20–MCC complex
APC/C^MCC-closed | Closed conformation of APC/C^MCC
APC/C^MCC-open | Open conformation of APC/C^MCC
APC/C^UbcH10-MCC | APC/C^MCC–UbcH10 complex
APC/C^Cdh1-Emi1 | APC/C^Cdh1–Emi1 complex
APC/C^Cdh1.substrate.UbcH10-Ub | APC/C^Cdh1–substrate-UbcH10 complex
APC/C^ΔApc15 | APC/C with Apc15 deleted
APC/C^ΔApc15-MCC | APC/C^MCC with Apc15 deleted
APC/C^OA | Reconstituted phosphorylated APC/C isolated from BV/insect cell cultures incubated with okadaic acid
Cdc20^APC/C | Cdc20 subunit of APC/C^Cdc20
Cdc20^MCC | Cdc20 subunit of MCC
Cdc20^APC/C-WD40 | WD40 domain of Cdc20 subunit of APC/C^Cdc20
C-Mad2 | Closed conformation of Mad2
O-Mad2 | Open conformation of Mad2
IR tail | Apc3 and Apc8 interacting C-terminal Ile-Arg motif of Cdc20 (Cdh1 and Apc10)
C box | Apc8 interacting motif of Cdc20 and Cdh1 (DRYIPxR)
KILR motif | C-Mad2-interaction motif of Cdc20^MCC and an Apc8-interacting motif of Cdc20^APC/C
LR tail | Leu-Arg-Arg-Leu C-terminal motif common to Ube2S and Emi1
D box | APC/C degron: RxxLxx[IV]xN
KEN box | APC/C degron: KEN
ABBA motif (A motif) | APC/C degron: Fx[ILV][FHY]x[DE]
CRY motif | APC/C degron: CRY
D1, D2 | D-1 box of BubR1, D-2 box of BubR1
A1, A2 | ABBA-1 motif of BubR1, ABBA-2 motif of BubR1
K1, K2 | KEN-1 box of BubR1, KEN-2 box of BubR1
Apc4^HBD | Helix bundle domain of Apc4
Apc5^NTD | N-terminal domain of Apc5
Apc15^NTH | N-terminal helix of Apc15
BubR1^TPR | BubR1 tetratricopeptide (TPR) domain
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Extended Data Table 2 | Summary of cryo-EM data and statistics 
a. Statistics of all cryo-EM reconstructions

Sample | Micrographs collected | Particles used for final reconstruction | Resolution (Å)
APC/C^MCC-closed |  | 155,263 | 4.2
APC/C^MCC-open |  | 6,188 | 8.6
APC/C^UbcH10-MCC | 2,340 | 8,491 | 8.9
APC/C^ΔApc15-MCC | 5,054 | 163,308 | 4.8

b. Statistics of all cryo-EM structure determination

Data collection
EM | FEI Polara, 300 keV
Detector | FEI Falcon III
Pixel size (Å) | 1.36
Defocus range (µm) | 2.0–4.0

Reconstruction | APC/C^MCC-closed | APC/C^MCC-open | APC/C^UbcH10-MCC | APC/C^ΔApc15-MCC
Software | RELION 1.4 | RELION 1.4 | RELION 1.4 | RELION 1.4
Accuracy of rotation (degrees) | 1.318 | 2.594 | 3.235 | 1.61
Accuracy of translations (pixels) | 0.843 | 1.763 | 2.09 | 1.08
Final resolution (Å) | 4.2 | 8.6 | 8.9 | 4.8

Refinement | APC/C^MCC-closed
Software | RefMac 5.8
Refmac weight | 0.04
Resolution limit (Å) | 4.0
Residue number | 9,014
Average Fourier shell correlation | 0.761
R factor | 0.2505
Rms bond length (Å) | 0.0126
Rms bond angle (°) | 1.746

Validation
Ramachandran plot
Preferred | 8,427 (93.49%)
Allowed | 353 (3.92%)
Outliers | 234 (2.60%)
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Extended Data Table 3 | Summary of phosphorylation sites in APC/C complexes used in this study 
Red shading indicates the presence of a phosphorylation site 
and white indicates its absence. 
At a distance of 1.295 parsecs (ref. 1), the red dwarf Proxima Centauri 
(α Centauri C, GL 551, HIP 70890 or simply Proxima) is the Sun’s 
closest stellar neighbour and one of the best-studied low-mass 
stars. It has an effective temperature of only around 3,050 kelvin, a 
luminosity of 0.15 per cent of that of the Sun, a measured radius of 
14 per cent of the radius of the Sun (ref. 2) and a mass of about 12 per cent 
of the mass of the Sun. Although Proxima is considered a moderately 
active star, its rotation period is about 83 days (ref. 3) and its 
quiescent activity levels and X-ray luminosity (ref. 4) are comparable 
to those of the Sun. Here we report observations that reveal the 
presence of a small planet with a minimum mass of about 1.3 Earth 
masses orbiting Proxima with a period of approximately 11.2 days 
at a semi-major-axis distance of around 0.05 astronomical units. Its 
equilibrium temperature is within the range where water could be 
liquid on its surface (ref. 5). 

The results presented here consist of an analysis of previously 
obtained Doppler measurements (pre-2016 data) and the confirma- 
tion of a signal in a specifically designed follow-up campaign in 2016. 
The Doppler data come from two precision radial velocity instruments, 
both at the European Southern Observatory (ESO): the High Accuracy 
Radial velocity Planet Searcher (HARPS) and the Ultraviolet and Visual 
Echelle Spectrograph (UVES). HARPS is a high-resolution stabilized 
echelle spectrometer installed at the ESO 3.6 m telescope (La Silla 
Observatory, Chile), the wavelength of which is calibrated using hollow 
cathode lamps (ThAr). HARPS has demonstrated radial velocity measurements 
at approximately 1 m s⁻¹ precision over timescales of years (ref. 6), 
including measurements of low-mass stars’. All of the HARPS spectra 
were extracted and calibrated with the standard ESO Data Reduction 
Software and radial velocities were measured using a least-squares tem- 
plate matching technique’. HARPS data are separated into two data 
sets. The first set includes all of the data obtained before 2016 by several 
programmes (HARPS pre-2016). The second HARPS set comes from 
the more recent Pale Red Dot campaign (PRD hereafter), which was 
designed to eliminate period ambiguities using new HARPS observa- 
tions and quasi-simultaneous photometry. The HARPS PRD observa- 
tions consisted of one spectrum obtained almost every night between 
19 January and 31 March 2016. The UVES observations used the iodine 
cell technique® and were obtained in the framework of the UVES survey 
for terrestrial planets around M-class dwarfs between 2000 and 2008. 
The spectra were extracted using the standard procedures of the UVES 
survey” and new radial velocities were obtained using up-to-date iodine 


reduction codes!°. As systematic calibration errors produce correlations 
among the observations for each night!!, we consolidated the Doppler 
measurements through nightly averages to present a simpler and more 
conservative signal search. This led to 72 UVES, 90 HARPS pre-2016 
and 54 HARPS PRD epochs. The PRD photometric observations were 
obtained using the Astrograph for the South Hemisphere II telescope 
(ASH2 hereafter; ref. 12; with S II and Hα narrowband filters) and the Las 
Cumbres Observatory Global Telescope network (ref. 13; with Johnson B and 
V band filters), over the same time interval and similar sampling rates 
as the HARPS PRD observations. Further details about each campaign 
and the photometry are detailed in Methods. All of the time series used 
in this work are available as Supplementary Data. 
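
The nightly consolidation described above can be illustrated with a minimal sketch. The text states only that the Doppler measurements were combined into nightly averages; the inverse-variance weighting and the use of the integer part of the Julian date to identify a night are assumptions made here for illustration, and the function name is hypothetical.

```python
import math
from collections import defaultdict


def nightly_averages(times, rvs, errs):
    """Combine measurements taken on the same night into one epoch.

    Assumptions (not stated in the text): a "night" is identified by the
    integer part of the Julian date, and measurements are combined with
    inverse-variance weights.
    Returns a list of (mean_time, mean_rv, mean_err) tuples sorted by night.
    """
    nights = defaultdict(list)
    for t, v, e in zip(times, rvs, errs):
        nights[math.floor(t)].append((t, v, e))

    epochs = []
    for night in sorted(nights):
        points = nights[night]
        weights = [1.0 / e ** 2 for _, _, e in points]
        wsum = sum(weights)
        t_mean = sum(w * t for w, (t, _, _) in zip(weights, points)) / wsum
        v_mean = sum(w * v for w, (_, v, _) in zip(weights, points)) / wsum
        # Formal uncertainty of an inverse-variance weighted mean
        epochs.append((t_mean, v_mean, math.sqrt(1.0 / wsum)))
    return epochs
```

Averaging in this way reduces the number of epochs but removes the within-night correlations that the text attributes to systematic calibration errors.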

The search and assessment of the statistical significance (see below 
and Methods for more details) of the signals were performed using 
frequentist’* and Bayesian’ methods. The periodograms in Fig. 1 
represent the improvement of a reference statistic as a function of trial 
period, with the peaks representing the most probable new signals. The 
improvement in the logarithm of the likelihood function Δ ln L is the 
reference statistic used in the frequentist framework, and its value is 
then used to assess the false-alarm probability (FAP) of the detection". 
An FAP below 1% is considered suggestive of periodic variability, and 
anything below 0.1% is considered to be a significant detection. In the 
Bayesian framework, the signals are first searched using a specialized 
sampling method!* that enables the exploration of multiple local max- 
ima of the posterior density (the result of this process is the red lines in 
Fig. 1), and the significance of the signals is then assessed by obtaining 
the ratios of evidences of the models. In a Bayesian context, the evidence 
Bₘ of a model m is the integral of its Bayesian posterior density. 
A more detailed description and references are provided in Methods. 
If the evidence ratio between two models exceeds some threshold (for 
example, B₁/B₀ > 10³), then the model in the numerator (with one 
planet) is favoured against the model in the denominator (no planet). 
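
The frequentist decision rule quoted above (an FAP below 1% is suggestive, below 0.1% is a significant detection) can be written down as a small helper. This is only a sketch of the stated thresholds; the function name and the returned labels are illustrative and do not come from the paper.

```python
def classify_fap(fap):
    """Label a false-alarm probability using the thresholds quoted in the text.

    FAP < 0.1%  -> "significant"
    FAP < 1%    -> "suggestive"
    otherwise   -> "not significant"
    """
    if not 0.0 <= fap <= 1.0:
        raise ValueError("a false-alarm probability must lie in [0, 1]")
    if fap < 1e-3:
        return "significant"
    if fap < 1e-2:
        return "suggestive"
    return "not significant"
```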

An isolated peak at about 11.2 d was recovered when all of the night 
averages in the pre-2016 data sets were averaged (Fig. 1a). Despite the 
significance of the signal, the analysis of the pre-2016 subsets produced 
slightly different periods depending on the noise assumptions and 
which of the subsets were considered. Confirmation or refutation of 
this signal at 11.2 d was the main driver for the proposal of the HARPS 
PRD campaign. The analysis of the HARPS PRD data revealed a single 
significant signal at approximately the same 11.3 ± 0.1 d period (Fig. 1b), 
but this period coincidence alone does not prove consistency with 
the pre-2016 data. Final confirmation is achieved when all of the data 
sets are combined (Fig. 1c): the statistical significance of the signal 
at 11.2 d then increases dramatically (FAP < 10⁻⁷, Bayesian evidence 
ratio B₁/B₀ > 10⁶). This implies that not only the period, but also the 
amplitude and phase are consistent during the 16 years of accumulated 
observations (see Fig. 2). All of the analyses performed with and 
without correlated-noise models produced consistent results. A second 
signal in the range of 60-500 d was also detected, but its nature is still 
unclear due to stellar activity and inadequate sampling. 

Figure 1 | Detection of a Doppler signal at 11.2 d. a, b, Detection 
periodograms of the 11.2 d signal in the HARPS+UVES pre-2016 data 
(a) and the HARPS PRD campaign only (b). c, The periodogram obtained 
after combining all of the data sets. Black lines correspond to the 
Δ ln L statistic, whereas the grey thick lines represent the logarithm of the 
Bayesian posterior density (see text, arbitrary vertical offset applied for 
visual comparison of the two statistics). The horizontal solid, dashed and 
dotted lines represent the FAP thresholds of the frequentist analysis. 

Figure 2 | All of the data sets phase-folded at the 11.2 d signal. Radial 
velocity measurements phase folded at the 11.2 d period of the planet 
candidate for 16 years of observations. The abscissa values of phase-folded 
plots are determined by first computing the difference between each Julian 
date and the reference epoch of the Keplerian fit (Julian date of the first 
UVES observation), and then computing the remainder of the division 
of this difference with the orbital period. Although its nature is unclear, 
a second signal at P ~ 200 d was fitted and subtracted from the data to 
produce this plot and improve visualization. Circles correspond to HARPS 
PRD, triangles are HARPS pre-2016 and squares are UVES. The black line 
represents the best Keplerian fit to this phase folded representation of the 
data. Error bars correspond to formal 1σ uncertainties. 
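
The phase-folding recipe given in the Figure 2 caption (subtract the reference epoch from each Julian date, then take the remainder of the division by the orbital period) can be sketched as follows. The function and argument names are illustrative only.

```python
def phase_fold(julian_dates, reference_epoch, period):
    """Return the folded abscissa, in days, for each observation.

    Each value is (JD - reference_epoch) modulo the period, so it lies in
    the interval [0, period). Python's % operator returns a non-negative
    result for a positive period, so dates earlier than the reference
    epoch are folded into the same interval.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    return [(jd - reference_epoch) % period for jd in julian_dates]
```

Dividing the returned values by the period gives the orbital phase between 0 and 1, if a dimensionless abscissa is preferred.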


Stellar variability can cause spurious Doppler signals that mimic 
planetary candidates, especially when combined with uneven 
sampling”!”. To address this, the time series of the photometry and 
spectroscopic activity indices were also searched for signals. After 
removing occasional flares, all four photometric time series show 
the same clear modulation with a period of P ~ 80 days (Fig. 3b-e), 
which is consistent with the previously reported photometric period of 
approximately 83 d (ref. 3). Spectroscopic activity indices were meas- 
ured on all of the HARPS spectra, and their time series were also inves- 
tigated. The width of the spectral lines (measured as the variance of the 
mean line, or m₂) follows a time dependence that is almost identical to 
the light curves, a behaviour that has already been reported for other 
M dwarf stars (ref. 18). The time series of the indices that are based on 
chromospheric emission lines (for example, Hα) do not show evidence of 
periodic variability, even after removing the data points that are likely 
to be affected by flares. We also investigated possible correlations of 
the Doppler measurements with the activity indices by including linear 
correlation terms in the Bayesian model of the Doppler data. Although 
some indices do show hints of correlation in some campaigns, includ- 
ing them in the model produces lower probabilities, owing to overpa- 
rameterization. Flares have very little effect on our Doppler velocities, 
as has already been suggested by previous observations of Proxima”. 
More details are provided in Methods and Extended Data Fig. 8. As the 
analysis of the activity data failed to identify any stellar activity feature 
that is likely to generate a spurious Doppler signal at 11.2 d, we con- 
clude that the variability in the data is best explained by the presence 
of a planet (Proxima b, hereafter) orbiting the star. All of the available 
photometric light curves were searched for evidence of transits, but no 
obvious transit-like features were detectable in our light curves. We used 
optimal box-Least-Squares codes”® to search for candidate signals in 
data from the All Sky Automatic Survey’. No significant transit signal 
was found down to a depth of about 5%. The most likely orbital solution 
and the putative properties of the planet and transits are given in Table 1. 

The Doppler semi-amplitude of Proxima b (approximately 1.4 m s⁻¹) 
is not particularly small compared with other reported planet candi- 
dates®. The uneven and sparse sampling combined with the longer- 
term variability of the star seem to be the reasons why the signal could 
not be unambiguously confirmed with pre-2016 data rather than the 
total amount of data accumulated. The corresponding minimum planetary 
mass is about 1.3 Earth masses (M⊕). With a semi-major axis of 
approximately 0.05 au, it lies squarely in the centre of the classical hab- 
itable zone for Proxima®. As mentioned earlier, the presence of another 
super-M⊕ planet cannot yet be ruled out at longer orbital periods and 
Doppler semi-amplitudes of <3 m s⁻¹. By numerical integration of 
some putative orbits, we verified that the presence of such a planet 
would not compromise the orbital stability of Proxima b. 

The habitability of planets like Proxima b, in the sense of sustaining 
an atmosphere and liquid water on its surface, is a matter of intense 
debate. The most common arguments against habitability are tidal locking, 
strong stellar magnetic fields, strong flares and high ultraviolet 
and X-ray fluxes; but none of these have been proved definitive. Tidal 
locking does not preclude a stable atmosphere via global atmospheric 
circulation and heat redistribution (ref. 21). The average global magnetic flux 
density of Proxima is 600 ± 150 G (ref. 22), which is quite large compared 
with that of the Sun (1 G). However, several studies have shown 
that planetary magnetic fields in tidally locked planets can be strong 
enough to prevent atmospheric erosion by stellar magnetic fields (ref. 23) 
and flares (ref. 24). Because of its close orbit to Proxima, Proxima b suffers 
from X-ray fluxes that are approximately 400 times that experienced 
by Earth, but studies of similar systems indicate that atmospheric losses 
can be relatively small (ref. 25). Further characterization of such planets can 
also inform us about the origin and evolution of terrestrial planets. 
For example, the formation of Proxima b from in situ disk material is 
implausible because disk models for small stars would contain less than 
1 M⊕ of solids within a distance of 1 au. There are three possibilities: the 
planet migrated in via type I migration (ref. 26); planetary embryos migrated 
in and coalesced at the current planet’s orbit; or pebbles/small planetesimals 
migrated via aerodynamic drag (ref. 27) and later coagulated into a 
larger body. Although migrated planets and embryos that originate 
beyond the ice-line would be rich in volatiles, pebble migration would 
produce much drier worlds. A warm terrestrial planet orbiting Proxima 
offers the opportunity to attempt further characterization via transits 
(ongoing searches), by direct imaging and high-resolution spectroscopy 
in the next decades (ref. 28), and possibly robotic exploration in the coming 
centuries (ref. 29). 

Figure 3 | Time series obtained during the PRD campaign. a, HARPS 
PRD radial velocity measurements. b, c, Quasi-simultaneous photometry 
from ASH2 for S II (b) and Hα (c). d, e, Quasi-simultaneous photometry 
from LCOGT for V (d) and B (e). f, g, Central moments of the mean line 
profiles for m₂ (f) and m₃ (g). The solid lines show the best fits. A dashed 
line indicates a signal that is not sufficiently statistically significant. 
Excluded measurements that were probably affected by activity events (for 
example, flares) are marked with grey arrows. The photometric time series 
and m₂ show evidence of the same approximately 80 d modulation. Error 
bars correspond to formal 1σ uncertainties. The horizontal axis of all 
panels is Julian Date (d), from 2457400 to 2457480. 

Table 1 | Stellar properties, Keplerian parameters, and derived quantities

Stellar properties | Value | Reference
Spectral type | M5.5V | 2
M*/M☉ | 0.120 (0.105–0.135) | 30
R*/R☉ | 0.141 (0.120–0.162) | 2
L*/L☉ | 0.00155 (0.00149–0.00161) | 2
Effective temperature (K) | 3,050 (2,950–3,150) | 2
Rotation period (d) | about 83 | 3
Habitable zone range (AU) | about 0.0423–0.0816 | 30
Habitable zone periods (d) | about 9.1–24.5 | 30

Keplerian fit | Proxima b
Period (d) | 11.186 (11.184–11.187)
Doppler amplitude (m s⁻¹) | 1.38 (1.17–1.59)
Eccentricity, e | <0.35
Mean longitude, λ = ω + M₀ (°) | 110 (102–118)
Argument of periastron, ω₀ (°) | 310 (0–360)

Statistics summary
Frequentist FAP | 7 × 10⁻⁸
Bayesian odds in favour, B₁/B₀ | 2.1 × 10⁷
UVES jitter (m s⁻¹) | 1.69 (1.22–2.33)
HARPS pre-2016 jitter (m s⁻¹) | 1.76 (1.22–2.36)
HARPS PRD jitter (m s⁻¹) | 1.14 (0.57–1.84)

Derived quantities
Orbital semi-major axis, a (AU) | 0.0485 (0.0434–0.0526)
Minimum mass, mₚ sin i (M⊕) | 1.27 (1.10–1.46)
Equilibrium black body temperature (K) | 234 (220–240)
Irradiance compared with Earth | 65%
Geometric probability of transit | about 1.5%
Transit depth (Earth-like density) | about 0.5%

The estimates are the maximum a posteriori values and the uncertainties of the parameters are 
expressed as 68% credibility intervals. We provide only an upper limit for the eccentricity 
(95% confidence level). Extended Data Table 1 contains the list of all of the model parameters. 
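
Some of the derived quantities in Table 1 can be checked to first order from the stellar and Keplerian parameters. The sketch below uses textbook relations (Kepler's third law in solar units, inverse-square scaling of the stellar flux, and the ratio of stellar radius to orbital distance for the transit probability of a circular orbit). These formulas, the function names and the solar-radius constant are assumptions made here for illustration; the paper's own values come from its full posterior analysis and need not match exactly.

```python
R_SUN_AU = 0.00465047  # solar radius expressed in astronomical units


def semi_major_axis_au(period_days, stellar_mass_msun):
    """Kepler's third law, a^3 = M * P^2 (a in au, P in years, M in solar
    masses), neglecting the planet's mass."""
    period_years = period_days / 365.25
    return (stellar_mass_msun * period_years ** 2) ** (1.0 / 3.0)


def irradiance_relative_to_earth(luminosity_lsun, a_au):
    """Stellar flux at distance a, in units of the flux Earth receives."""
    return luminosity_lsun / a_au ** 2


def transit_probability(stellar_radius_rsun, a_au):
    """Geometric transit probability R*/a for a circular orbit, ignoring
    the planet's own radius."""
    return stellar_radius_rsun * R_SUN_AU / a_au


if __name__ == "__main__":
    a = semi_major_axis_au(11.186, 0.120)
    print("a (au):", round(a, 4))
    print("irradiance:", round(irradiance_relative_to_earth(0.00155, 0.0485), 3))
    print("transit probability:", round(transit_probability(0.141, 0.0485), 4))
```

With the Table 1 inputs these relations give a semi-major axis of roughly 0.048 au, an irradiance of about 0.66 times that of Earth and a star-only transit probability of about 1.4%, which is in line with the tabulated 0.0485 au, 65% and about 1.5%.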
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Statistical frameworks and tools. The analyses of time series including the radial 
velocities and activity indices were performed by frequentist and Bayesian meth- 
ods. In all cases, the statistical significance was assessed using model comparisons 
by performing global multiparametric fits to the data. Here we provide a minimal 
overview of the methods and assumptions used throughout the Letter. 

Bayesian statistical analyses. The analyses of the radial velocity data were per- 
formed by applying posterior sampling algorithms called Markov chain Monte 
Carlo methods. We used the adaptive Metropolis algorithm*?, which has previously 
been applied to such radial velocity data sets'***. This algorithm is simply a gener- 
alized version of the common Metropolis—Hastings algorithm**"4 that adapts to 
the posterior density based on the previous members of the chain. 

The likelihood functions and posterior densities of models with periodic sig- 
nals are highly multimodal (that is, they have peaks in the periodograms). For 
this reason, in our Bayesian signal searches we applied the delayed rejection adap- 
tive Metropolis (DRAM) method", which enables efficient jumping of the chain 
between multiple modes by postponing the rejection of a proposed parameter vector 
by first attempting to find a better value in its vicinity. For every given model we 
performed several posterior samplings with different initial values to ensure con- 
vergence to a unique solution. When we identified two or more substantial maxima 
in the posterior density, we typically performed several additional samplings with 
initial states close to those maxima. This enabled us to evaluate their relative impor- 
tance in a consistent manner. We estimated the marginal likelihoods and the corre- 
sponding Bayesian evidence ratios of different models by using a simple method”. 
A more detailed description of these methods can be found elsewhere”. 
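
For orientation, the basic Metropolis–Hastings step that the adaptive variants generalize can be sketched in a few lines. This is a plain random-walk Metropolis sampler with a fixed Gaussian proposal, not the adaptive Metropolis or DRAM algorithms used in the paper: it does not adapt the proposal to the chain history and has no delayed-rejection stage. All names are illustrative.

```python
import math
import random


def metropolis(log_posterior, x0, step, n_samples, seed=0):
    """Random-walk Metropolis sampler (non-adaptive, for illustration only).

    log_posterior: function mapping a parameter list to ln(posterior density)
    x0:            initial parameter list
    step:          standard deviation of the Gaussian proposal (same for all
                   parameters; an adaptive scheme would tune this)
    Returns the chain as a list of parameter lists.
    """
    rng = random.Random(seed)
    x = list(x0)
    lp = log_posterior(x)
    chain = []
    for _ in range(n_samples):
        proposal = [xi + rng.gauss(0.0, step) for xi in x]
        lp_prop = log_posterior(proposal)
        # Accept with probability min(1, posterior ratio);
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < lp_prop - lp:
            x, lp = proposal, lp_prop
        chain.append(list(x))
    return chain
```

Running several chains from different `x0` values, as the text describes, is a simple way to check whether they settle on the same posterior maximum.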
Doppler model and likelihood function. Assuming that the ith radial velocity 
measurement $m_{i,\mathrm{INS}}$ is obtained at some instant $t_i$ from an instrument INS, the 
likelihood function of the observations (probability of the data given a model) is given by 

$$L = \prod_{\mathrm{INS}} \prod_{i} l_{i,\mathrm{INS}} \qquad (1)$$

$$l_{i,\mathrm{INS}} = \frac{1}{\sqrt{2\pi(\sigma_i^2 + \sigma_{\mathrm{INS}}^2)}} \exp\left[-\frac{\epsilon_{i,\mathrm{INS}}^2}{2(\sigma_i^2 + \sigma_{\mathrm{INS}}^2)}\right] \qquad (2)$$

$$\epsilon_{i,\mathrm{INS}} = m_{i,\mathrm{INS}} - \{\gamma_{\mathrm{INS}} + \dot{\gamma}\,\Delta t_i + \kappa(\Delta t_i) + \mathrm{MA}_{i,\mathrm{INS}} + A_{i,\mathrm{INS}}\} \qquad (3)$$

$$\Delta t_i = t_i - t_0 \qquad (4)$$

where $t_0$ is some reference epoch. This reference epoch can be arbitrarily chosen, 
often as the beginning of the time series or a mid-point of the observing campaigns. 
The other terms are: 

(1) $\epsilon_{i,\mathrm{INS}}$ are the residuals to a fit. We assume that each $\epsilon_{i,\mathrm{INS}}$ value is a Gaussian 
random variable with a zero mean and a variance of $\sigma_i^2 + \sigma_{\mathrm{INS}}^2$, where $\sigma_i$ is the 
reported uncertainty of the ith measurement and $\sigma_{\mathrm{INS}}$ is the jitter parameter, which 
represents the excess white noise not included in $\sigma_i$. 

(2) $\gamma_{\mathrm{INS}}$ is the zero-point velocity of each instrument. Each INS can have a 
different zero point depending on how the radial velocities are measured and how 
the wavelengths are calibrated. 

(3) $\dot{\gamma}$ is a linear trend parameter caused by a long-term acceleration. 

(4) The term $\kappa(\Delta t_i)$ is the superposition of k Keplerian signals evaluated at 
$\Delta t_i$. Each Keplerian signal depends on five parameters: the orbital period $P_p$, the 
semi-amplitude of the signal $K_p$, the mean anomaly $M_{0,p}$, which represents the phase of the orbit 
with respect to the periastron of the orbit at $t_0$, the orbital eccentricity $e_p$ that 
goes from 0 (circular orbit) to 1 (unbound parabolic orbit) and the argument of 
periastron $\omega_p$, which is the angle on the orbital plane with respect to the plane 
of the sky at which the star goes through the periastron of its orbit (the planet's 
periastron is at $\omega_p$ + 180°). Detailed definitions of the parameters can be found 
elsewhere*”. 

(5) The moving average term 

$$\mathrm{MA}_{i,\mathrm{INS}} = \phi_{\mathrm{INS}} \exp\left\{\frac{t_{i-1} - t_i}{\tau_{\mathrm{INS}}}\right\} \epsilon_{i-1,\mathrm{INS}} \qquad (5)$$

is a simple parameterization of the possible correlated noise that depends on 
the residual of the previous measurement $\epsilon_{i-1,\mathrm{INS}}$. As for the other parameters 
related to noise in our model, we assume that the parameters of the MA function 
depend on the instrument; for example the different wavelength ranges used will 
cause different properties of the instrumental systematic noise. Keplerian and 
other physical processes also introduce correlations into the data, therefore some 
degree of degeneracy between the MA terms and the signals of interest is expected. 
As a result, including an MA term always produces more conservative statistical 
significance estimates than a model with uncorrelated random noise only. The 
MA model is implemented through a coefficient $\phi_{\mathrm{INS}}$ and a timescale $\tau_{\mathrm{INS}}$. $\phi_{\mathrm{INS}}$ 
quantifies the strength of the correlation between the i and i − 1 measurements. 
It is bound between −1 and 1 to guarantee that the process is stationary (that is, 
the contribution of the MA term does not arbitrarily grow over time). Exponential 
smoothing is used to decrease the strength of the correlation exponentially as the 
difference $t_i - t_{i-1}$ increases*®. 

(6) Linear correlations with activity indices can also be included in the model 
in the following manner 

$$A_{i,\mathrm{INS}} = \sum_{\xi} C_{\xi,\mathrm{INS}}\, \xi_{i,\mathrm{INS}} \qquad (6)$$


where ξ runs over all of the activity indices used to model each INS data set (for 
example, m2, m3, S-index and so on, whose descriptions are provided below). To 
avoid any confusion with other discussions about correlations, we call these C_{ξ,INS} 
activity coefficients. Note that each activity coefficient C_{ξ,INS} is associated with one 
activity index (ξ_i) obtained simultaneously with the ith radial velocity measurement 
(for example, chromospheric emission from the Hα line, the second moment of 
the mean-line profile, the interpolated photometric flux and so on). When fitting 
a model to the data, an activity coefficient substantially different from 0 indicates 
evidence of Doppler variability correlated with the corresponding activity index. 
Formally speaking, these C_{ξ,INS} correspond to the coefficients of the first-order 
Taylor expansion of a physical model for the apparent radial velocities as a function 
of the activity indices and other physical properties of the star. 
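
As an illustration of how equations (1) to (5) fit together, the following is a minimal sketch of the log-likelihood for a single instrument. It is not the code used in this work: the function name, the argument names and the `signal` callable (standing in for κ) are hypothetical, and the activity term of equation (6) is omitted for brevity.

```python
import numpy as np

def log_likelihood(t, v, sigma, gamma, gamma_dot, jitter, phi, tau, signal, t0=None):
    """Log of equations (1)-(5) for one instrument (activity terms omitted).

    `signal` is an assumed callable returning kappa(dt) for an array of dt.
    """
    t = np.asarray(t, dtype=float)
    v = np.asarray(v, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    dt = t - (t[0] if t0 is None else t0)           # equation (4)
    deterministic = gamma + gamma_dot * dt + signal(dt)
    var = sigma**2 + jitter**2                      # reported error plus jitter
    eps = np.empty_like(v)
    for i in range(len(t)):
        ma = 0.0
        if i > 0:
            # Equation (5): correlation with the previous residual,
            # damped exponentially with the time gap.
            ma = phi * np.exp((t[i - 1] - t[i]) / tau) * eps[i - 1]
        eps[i] = v[i] - (deterministic[i] + ma)     # equation (3)
    # Sum of the logarithms of the Gaussian factors in equation (2).
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + eps**2 / var)
```

The residuals are computed sequentially because each MA term depends on the previous residual. For several instruments, the total log-likelihood would be the sum of this quantity over the data sets, each with its own zero point, jitter and MA parameters.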

A simplified version of the same likelihood model is used when analysing time 
series of activity indices. That is, when searching for periodicities in series other 
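than Doppler measurements, the model will consist of the γ_INS zero points, a linear 
trend term γ̇Δt_i and a sum of n sinusoids 

$$\kappa(\Delta t_i; \theta) = \sum_{k=1}^{n} \left[ A_k \sin\left(\frac{2\pi \Delta t_i}{P_k}\right) + B_k \cos\left(\frac{2\pi \Delta t_i}{P_k}\right) \right] \qquad (7)$$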


where each kth sinusoid has three parameters A_k, B_k and P_k instead of the five 
Keplerian ones. Except for the period parameters and the jitter terms, this model 
is linear with all the other parameters, which allows a relatively quick computation 
of the likelihood-ratio periodograms. 
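
The linearity noted above can be sketched as follows. At a fixed trial period, the zero point, trend, A and B are obtained by weighted linear least squares, and the log-likelihood improvement follows directly. This is a simplified illustration, not the implementation used here: it assumes a single data set and holds the noise fixed at the reported uncertainties (no jitter or MA terms), in which case the improvement reduces to half the χ² difference. All function names are hypothetical.

```python
import numpy as np

def sinusoid_delta_lnL(t, y, sigma, period):
    """Log-likelihood gain from adding one sinusoid at a fixed trial period.

    Assumes fixed Gaussian noise (no jitter), so the gain is
    0.5 * (chi2 without sinusoid - chi2 with sinusoid).
    """
    t, y, sigma = (np.asarray(a, dtype=float) for a in (t, y, sigma))
    dt = t - t[0]
    w = 1.0 / sigma
    base = np.column_stack([np.ones_like(dt), dt])   # zero point + linear trend
    arg = 2.0 * np.pi * dt / period
    full = np.column_stack([base, np.sin(arg), np.cos(arg)])

    def chi2(design):
        # Weighted linear least squares: all parameters enter linearly.
        coef, *_ = np.linalg.lstsq(design * w[:, None], y * w, rcond=None)
        resid = (y - design @ coef) * w
        return float(resid @ resid)

    return 0.5 * (chi2(base) - chi2(full))

def likelihood_ratio_periodogram(t, y, sigma, periods):
    """Evaluate the gain on a grid of trial periods."""
    return np.array([sinusoid_delta_lnL(t, y, sigma, p) for p in periods])
```

Because only the period is scanned on a grid, each trial period costs one small linear solve, which is what makes this kind of periodogram quick to compute.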

Bayesian prior choices. As in any Bayesian analysis, the prior densities of the model 
parameters have to be selected in a suitable manner (for example see ref. 39). We 
used uniform and uninformative distributions for most of the parameters apart 
from a few, possibly important, exceptions. First, as we used a parameter l = ln P 
in the Markov chain Monte Carlo samplings instead of the period P directly, the 
uniform prior density π(l) = c for all l ∈ [ln P_0, ln P_max], where P_0 and P_max are some 
minimum and maximum periods, does not correspond to a uniform prior in P. 
Instead, this prior corresponds to a period prior such that π(P) ∝ P⁻¹ (ref. 40). 
We made this choice because the period can be considered a scale parameter for 
which an uninformative prior is one that is uniform in ln P (ref. 41). We selected 
the parameter space of the period such that P_0 = 1 d and P_max = T_obs, where T_obs is 
the time baseline of the combined data sets. 
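
The correspondence between the uniform prior in l and the P⁻¹ prior in P follows from a change of variables. The normalization constant shown here is not quoted in the text; it is obtained by integrating over the stated range and is included only for illustration:

$$\pi(P) = \pi(l)\left|\frac{\mathrm{d}l}{\mathrm{d}P}\right| = \frac{c}{P}, \qquad c = \frac{1}{\ln(P_{\max}/P_0)}$$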

For the semi-amplitude parameter K, we used a π(K) = c for all K ∈ [0, K_max], 
where K_max was selected as 10 m s⁻¹ because the r.m.s. of the Doppler series did 
not exceed 3 m s⁻¹ in any of the sets. Following previous works, we chose the 
prior for the orbital eccentricities as π(e) ∝ N(0, σ_e²), where e is bound between 
zero (circular orbit) and 1. We set this σ_e² = 0.3 to penalize high eccentricities while 
keeping the option of high e if the data strongly favour it. 

We also used an informative prior for the excess white noise parameter σ_INS 
for each instrument. Based on analyses of a sample of M dwarfs, this stellar jitter 
is typically very close to a value of 1 m s⁻¹. Thus, we used a prior such that 
π(σ_INS) ∝ N(μ_σ, σ_σ²), where the parameters were selected as μ_σ = σ_σ = 1 m s⁻¹. Uniform 
priors were used in all the activity coefficients C_ξ ∈ [−C_{ξ,max}, C_{ξ,max}]. For practical 
purposes, the time series of all activity indices were mean subtracted and normalized 
to their r.m.s. This choice allows us to select the bounds of the activity coefficients 
for the renormalized time series as C_{ξ,max} = 3 m s⁻¹, so that adding 
correlation terms does not dramatically increase the r.m.s. of the Doppler time 
series over the initially measured r.m.s. of <3 m s⁻¹ (same argument as for the prior 
on K). This renormalization is automatically applied by our codes at initialization. 
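
The renormalization of the activity indices described above can be sketched in a few lines. This is an illustrative helper with a hypothetical name, not the code used in this work.

```python
import numpy as np

def renormalize_index(values):
    """Mean-subtract an activity index and scale it to unit r.m.s."""
    x = np.asarray(values, dtype=float)
    centred = x - x.mean()
    rms = np.sqrt(np.mean(centred**2))
    return centred / rms
```

With every index scaled to unit r.m.s., an activity coefficient bounded by 3 m s⁻¹ in absolute value cannot contribute a correlated term with an r.m.s. above 3 m s⁻¹, which is the reasoning behind the bound quoted above.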
Search for periodicities and statistically significant signals in a frequentist framework. 
Periodograms are plots that represent a figure of merit derived from a fit against 
the period of a newly proposed signal. In the case of unevenly sampled data, a 
very popular periodogram is the Lomb-Scargle periodogram and its variants, 
such as the floating-mean periodogram or the F-ratio periodogram. In this 
work we use likelihood-ratio periodograms, which represent the improvement of 
the likelihood statistic when adding a new sinusoidal signal to the model. Owing 
to intrinsic nonlinearities in the Keplerian/radial velocity modelling, optimizing 
the likelihood statistic is more computationally intensive than the classic 
Lomb-Scargle-like periodograms. On the other hand, the likelihood function is a more 
general and well-behaved statistic that, for example, allows for the optimization of 
the noise parameters (such as jitter and the fit correlated noise models at the signal 
search level). By well-behaved we mean that it has less intrinsic variance compared 
with other statistics that do not include parameters for the noise, such as the χ² 
statistic. Once the maximum likelihood of a model with one additional planet is 
found (the highest peak in the periodogram), its FAP can then be easily computed. 
In general, an FAP of 1% is needed to claim hints of variability, and a value below 
0.1% is considered necessary to claim a statistically significant detection. 

Spectroscopic data sets. New reduction of the UVES M-dwarf programme data. 
Between 2000 and 2008, Proxima was observed in the framework of a precision 
radial velocity survey of M dwarfs in search of extrasolar planets with UVES 
installed in the Very Large Telescope unit 2. To attain high-precision radial velocity 
measurements, UVES was self-calibrated with its iodine gas absorption cell operated 
at a temperature of 70 °C. Image slicer number 3 was chosen, which redistributes the 
light from a 1″ × 1″ aperture along the chosen 0.3″-wide slit. In this way, a resolving 
power of R = 100,000–120,000 was attained. At the selected central wavelength of 
600 nm, the useful spectral range containing iodine (I2) absorption lines (500–
600 nm) falls entirely on the better-quality detector of the mosaic of two 4K × 2K 
CCDs. More details can be found in the several papers from the UVES survey. 

The extracted UVES spectra include 241 observations taken through the iodine 
cell, three template (no iodine) shots of Proxima and three spectra of the rapidly 
rotating B star HR 5987 that are also taken through the iodine cell and almost 
consecutive to the three template shots. The B star has a smooth spectrum devoid 
of spectral features and it was used to calibrate the three template observations 
of the target. Ten of the iodine observations of Proxima were eliminated due to 
low exposure levels. The remaining 231 iodine shots of Proxima were taken on 
77 nights, typically 3 consecutive shots per night. 

The first steps in the processing of the I2-calibrated data consist of constructing 
the high signal-to-noise template spectrum of the star without iodine: (1) a custom 
model of the UVES instrumental profile is generated on the basis of the observations 
of the B star by forward modelling the observations using a higher-resolution 
(R = 700,000–1,000,000) template spectrum of the I2 cell obtained with the McMath 
Fourier Transform Spectrometer (FTS) on Kitt Peak; (2) the three template 
observations of Proxima are then co-added and filtered for outliers; and (3) on 
the basis of the instrument profile model and wavelength solution derived from 
the three B star observations, the template is deconvolved with our standard 
software. After the creation of the stellar template, the 231 iodine observations 
of Proxima were then run through our standard precision velocity code. The 
resulting standard deviation of the 231 unbinned observations is 2.58 m s⁻¹ 
and the standard deviation of the 77 nightly binned observations is 2.30 m s⁻¹, 
which already suggests an improvement compared to the 3.11 m s⁻¹ reported in 
the original UVES survey reports. All of the UVES spectra (raw) are publicly 
available in their reduced form via ESO's archive at http://archive.eso.org/cms.html. 
Extracted spectra are not produced for this mode of UVES operation, but 
they are available upon request from the corresponding author. 
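
The text does not specify how the nightly bins were formed. As one plausible reading, the sketch below groups measurements by the integer part of their time stamp (in days) and takes inverse-variance weighted means. Both the grouping rule and the weighting are assumptions made for illustration, and the function name is hypothetical.

```python
import numpy as np

def nightly_bins(times, values, sigmas):
    """Weighted nightly means: returns rows of (time, value, uncertainty).

    Assumes time stamps in days, shifted so that no night straddles
    an integer boundary.
    """
    t = np.asarray(times, dtype=float)
    v = np.asarray(values, dtype=float)
    s = np.asarray(sigmas, dtype=float)
    nights = np.floor(t).astype(int)
    rows = []
    for night in np.unique(nights):
        m = nights == night
        w = 1.0 / s[m] ** 2                  # inverse-variance weights
        rows.append((np.sum(w * t[m]) / w.sum(),
                     np.sum(w * v[m]) / w.sum(),
                     1.0 / np.sqrt(w.sum())))
    return np.array(rows)
```

With typically three consecutive shots per night, 231 individual measurements collapse to 77 binned points under this kind of scheme.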

HARPS GTO. The initial HARPS-Guaranteed Time Observations programme was 
led by M. Mayor (ESO ID: 072.C-0488). 19 spectra were obtained between May 
2005 and July 2008. The typical integration time ranges between 450 s and 900 s. 
HARPS M-dwarfs. Led by X. Bonfils and collaborators, this programme consists 
of ESO programmes 082.C-0718 and 183.C-0437. It produced 8 and 46 measurements, 
respectively, with integration times of 900 s in almost all cases. 

HARPS high-cadence. This programme consisted of two 10-night runs (May 2013, 
and December 2013, ESO ID: 191.C-0505) and was led and executed by several 
co-authors of this work. Proxima was observed on two runs: 

(1) May 2013: 143 spectra obtained on three consecutive nights between 4 May 
and 7 May and 25 additional spectra between 7 May and 16 May with exposure 
times of 900 s. 

(2) December 2013: 23 spectra obtained between 30 December and 10 January 
2014 also with 900 s exposure times. 

For simplicity in the presentation of the data and analyses, all HARPS data 
obtained before 2016 (HARPS GTO, HARPS M-dwarfs, and HARPS high-cadence) 
are integrated in the HARPS pre-2016 set. The long-term Doppler variability and 
sparse sampling make the detection of the Doppler signal more challenging in 
such a consolidated set than when, for example, separating it into subsets of contiguous 
nights. The latter strategy, however, necessarily requires more parameters (offsets, 
jitter terms, correlated noise parameters) and arbitrary choices on the sets to be 
used, producing strong degeneracies and aliasing ambiguities in the determination 
of the favoured solution (11.2 d was typically favoured, but alternative periods 
caused by a non-trivial window function at 13.6 d and 18.3 d were also found to 
be possible). The data taken in 2016 exclusively correspond to the new campaign 
specifically designed to address the sampling issues. 
HARPS PRD campaign. The PRD campaign was executed between 18 January and 
30 March 2016. Interruptions of a few nights were anticipated to allow for technical 
work and other time-critical observations with HARPS. Of the 60 scheduled 
epochs, we obtained 56 spectra on 54 nights (two spectra were obtained on two of 
those nights). Integration times were set to 1,200 s, and observations were always 
obtained at the very end of each night. All of the HARPS spectra (raw, extracted 
and calibrated frames) are publicly available in their reduced form via ESO’s archive 
at http://archive.eso.org/cms.html. 
Spectroscopic indices. Stellar activity can be traced by features in the stellar spec- 
trum. For example, changes in the line-profile shapes (symmetry and width) have 
been associated with spurious Doppler shifts. Chromospheric emission lines 
are tracers of spurious Doppler variability in the Sun and they are expected to 
behave similarly for other stars. We describe here the indices measured and used 
in our analyses. 
Measurements of the mean spectral line profiles. The HARPS Data Reduction 
Software provides two measurements of the mean-line profile shapes derived from 
the cross-correlation function (CCF) of the stellar spectrum with a binary mask. 
These are called the bisector span (or BIS) and full-width-at-half-maximum (or 
FWHM) of the CCF. For very-late-type stars like Proxima, all of the spectral lines 
are blended, producing a non-trivial shape of the CCF and thus the interpretation 
of the usual line-shape measurements is not nearly as reliable as in earlier-type 
stars. We applied the least-squares deconvolution (LSD) technique to obtain a 
more accurate estimate of the spectral mean line profile. This profile is generated 
from the convolution of a kernel, which is a model spectrum of line positions and 
intensities, with the observed spectrum. Our implementation of 
the procedure, applied specifically to crowded M dwarf spectra, is described in 
ref. 54. The LSD profile can be interpreted as a probability distribution function 
that can then be characterized by its central moments. We computed the second 
(m2) and third (m3) central moments of each LSD profile for each observation. 
More details of these indices and how they compare with other standard HARPS 
cross-correlation measurements can be found in ref. 11. To eliminate the correlation 
of the profile moments with the slope of the spectral energy distribution, we 
corrected the spectral energy distribution and blaze function to match the 
spectral energy distribution of the highest signal-to-noise ratio (or S/N) observation 
obtained with HARPS. Uncertainties were obtained using an empirical 
procedure as follows: we derived all the m2 and m3 measurements of the high-cadence 
night of 7 May 2013 and fitted a polynomial to each time series. The 
standard deviation of the residuals to that fit was then assumed to be the expected 
uncertainty for a S/N of 20 (at reference echelle aperture number 60), which was 
the typical value for that night's observations. All other errors were then obtained 
by scaling this standard deviation by a factor of 20/(S/N_obs) for each observation. 
Chromospheric indices. Chromospheric emission lines are tracers of spurious 
Doppler variability in the Sun and they are expected to behave similarly for other 
stars. We describe here the indices computed and used in our analyses. 
Chromospheric Ca II H+K S-index. We calculated the Ca II H+K fluxes following 
standard procedures; both the PRD data and the pre-2016 data were treated 
the same. Uncertainties were calculated from the quadrature sum of the variance 
in the data used within each bandpass. 
Chromospheric Hα emission. This index was measured in a similar way to the 
S-indices, in that we summed the fluxes in the centre of the lines, calculated to be 
6,562.808 Å, this time using square bandpasses of 0.678 Å rather than triangular shapes, 
and those were normalized to the summed fluxes of two square continuum band 
regions surrounding each of the lines in the time series. The continuum square 
bandpasses were centred at 6,550.870 Å and 6,580.309 Å and had widths of 10.75 Å 
and 8.75 Å, respectively. Again the uncertainties were calculated from the quadrature 
sum of the variance of the data within the bandpasses. 
Photometric data sets. ASH2. The ASH2 telescope is a 40 cm robotic telescope with 
a CCD camera STL11000 2.7K × 4K, and a field-of-view (FOV) of 54 × 82 arcmin. 
Observations were obtained in two narrow-band filters centred on the Hα and S II 
lines, respectively (Hα is centred on 656 nm, S II is centred on 672 nm, and both 
filters have a Gaussian-like transmission with a FWHM of 12 nm). The telescope 
is at SPACEOBS (San Pedro de Atacama Celestial Explorations Observatory), at 
2,450 m above sea level, located in the northern Atacama Desert in Chile. This 
telescope is managed and supported by the Instituto de Astrofísica de Andalucía 
(Spain). During the present work, only subframes with 40% of the total FOV were 
used, resulting in a useful FOV of 21.6 × 32.8 arcmin. Approximately 20 images in 
each band of 100 s of exposure time were obtained per night. In total, 66 epochs of 
about 100 min each were obtained during this campaign. The number of images 
collected per night was increased during the second part of the campaign (to about 
40 images in each filter per night). 

All CCD measurements were obtained by the method of synthetic aperture 
photometry using a 2 × 2 binning. Each CCD frame was corrected in a standard 
way for dark and flat-fielding. Different aperture sizes were also tested to choose 
the best one for our observations. A number of nearby and relatively bright stars 
within the frames were selected as check stars to choose the best ones to be used as 
comparison stars. After checking their stability, C2 = HD 126625 and C8 = TYC 
9010-3029-1 were selected as main comparison stars. 

The basic photometric data were computed as the differences in magnitude 
in the S II and Hα filters for Var−X and C2−X, with Var = Proxima and X = (C2 + 
C8)/2. Typical uncertainties of each individual data point are about 6.0 mmag, for 
both S II and Hα filters. This usually leads to error bars of about 1.3 mmag in the 
determination of the mean levels of each epoch, assuming 20 points per night once 
occasional strong activity episodes (such as flares) are removed for the analysis of 
periodicities. For the analyses, these magnitudes were transformed to relative flux 
measurements normalized to the mean flux over the campaign. 
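
The two numerical steps in the preceding paragraph can be checked with a short sketch: the error of a nightly mean for independent points, and the standard conversion from differential magnitudes to relative flux. The function names are hypothetical, and the assumption of independent, equally uncertain points is ours.

```python
import numpy as np

def error_of_mean(sigma_single, n_points):
    """Uncertainty of the mean of n independent points with equal errors."""
    return sigma_single / np.sqrt(n_points)

def magnitudes_to_relative_flux(delta_mag):
    """Convert differential magnitudes to flux normalized to its mean."""
    flux = 10.0 ** (-0.4 * np.asarray(delta_mag, dtype=float))
    return flux / flux.mean()

# 6.0 mmag per point and 20 points per night give about 1.34 mmag,
# consistent with the ~1.3 mmag quoted in the text.
print(round(error_of_mean(6.0, 20), 2))
```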
LCOGT network. The LCOGT is an organization dedicated to time-domain 
astronomy. To facilitate this, LCOGT operates a homogeneous network of 1 m 
and 2 m telescopes on multiple sites around the world. The telescopes are controlled 
by a single robotic scheduler, which is capable of orchestrating complex responsive 
observing programmes, using the entire network to provide uninterrupted obser- 
vations of any astronomical target of interest. Each site hosts between one and three 
telescopes, which are configured for imaging and spectroscopy. The telescopes are 
equipped with identical instruments and filters, which allows for network redun- 
dancy. This means that observations can be seamlessly shifted to alternate sites at 
any time if the scientific programme requires it, or in the event of poor weather. 

Observations for the PRD campaign were obtained on the 1 m network every 
24 h in the B and V bands with the Sinistro (4K × 4K Fairchild CCD486) cameras, 
which have a pixel scale of 0.38 arcsec and a FOV of 27 × 27 arcminutes. In 
addition, B and V observations were taken every 12 h with the SBIG (4K × 4K 
Kodak KAF-6303E CCD) cameras, with a pixel scale of 0.46 arcsec and a FOV of 
16 × 16 arcminutes. Exposure times ranged between 15 and 40 s and a total of 488 
photometrically useful images were obtained during the campaign. 

The photometric measurements were performed using aperture photometry 
with AstroImageJ and DEFOT. The aperture sizes were optimized during the 
analysis with the aim of minimizing the measurement noise. Proxima and two 
non-variable comparison stars were identified in a reference image and used to 
construct the detrended light curves. As with the ASH2 curves, the LCOGT differ- 
ential magnitudes were transformed to normalized flux to facilitate interpretation 
and later analyses (see Fig. 3). 

Signals in time series. In this section we present a homogeneous analysis of all of 
the time series (Doppler, activity and photometric) presented in this Letter. In all 
of the periodograms, the black curve represents the search for a first signal. If one 
first signal is identified, then a red curve represents the search for a second signal. 
In the few cases where a second signal is detected, a blue curve represents the search 
for a third signal. The period of Proxima b is marked with a green vertical line. 
Module of the window function. We first present the so-called window function of 
the three sets under discussion. The window function is the Fourier transform of 
the sampling. Its module shows the frequencies (or periods) where a signal with 
0 frequency (or infinite period) would have its aliases. As shown in Extended Data 
Fig. 1, both the UVES and HARPS PRD campaigns have a relatively clear window 
function between 1 and 360 d, meaning that the peaks in periodograms can be 
interpreted in a very straightforward way (no aliasing ambiguities). For the UVES 
case, this happens because the measurements were uniformly spread over several 
years without severe clustering, producing strong aliases only at the beat frequencies 
caused by the usual daily and yearly sampling (peaks at 360, 1, 0.5 and 0.33 d). The 
window of the PRD campaign is simpler, which is the result of a shorter time span 
and the uniform sampling of the campaign. On the other hand, the HARPS pre- 
2016 window function (Extended Data Fig. 1b) contains numerous peaks between 
1 and 360 d. This means that signals (for example, activity) in the range of a few 
hundred days will inject severe interference in the period domain of interest, and 
explains why this set is where the Doppler signal at 11.2 d is detected with a lower 
confidence (see Extended Data Fig. 2). 
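
A minimal sketch of how the module of the window function can be evaluated from the observing epochs alone is given below. The normalization by the number of points and the function name are choices made here for illustration and are not taken from the text.

```python
import numpy as np

def window_modulus(times, periods):
    """Module of the Fourier transform of the sampling at given periods.

    Normalized so that a value of 1 means a perfect alias.
    """
    t = np.asarray(times, dtype=float)
    freqs = 1.0 / np.asarray(periods, dtype=float)
    phases = -2j * np.pi * np.outer(freqs, t - t[0])
    return np.abs(np.exp(phases).sum(axis=1)) / t.size
```

For strictly nightly sampling, for example, this function returns 1 at a period of 1 d, which is the daily alias mentioned above. A window that stays low across the period range of interest is what allows periodogram peaks to be read without aliasing ambiguities.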

Radial velocities. Here we present likelihood-ratio periodogram searches for signals 
in the three Doppler time series separately (PRD, HARPS pre-2016 and UVES). 
They are analysed in the same way as the activity indices to enable direct visual 
comparison. They differ from the ones presented in the main Letter in the sense 
that they do not include MA terms and the signals are modelled as pure sinusoids 
to mirror the analysis of the other time series as closely as possible. The resulting 
periodograms are shown in Extended Data Fig. 2. A signal at 11.2 d was close to 
detection using UVES data only. However, we note that the signal was not clearly 
detectable using the Doppler measurements as provided by the UVES survey, and 
it only became obvious when new Doppler measurements were re-derived using 
up-to-date iodine codes (see Methods subsection ‘New reduction of the UVES 
M-dwarf programme data’). The signal is weaker in the HARPS pre-2016 data 
set, but it still appears as a possible second signal after modelling the longer-term 
variability with a Keplerian at 200 d. Subsets of the HARPS pre-2016 data taken 
on consecutive nights (for example, HARPS high-cadence runs) also show strong 
evidence of the same signal. However, splitting the data into subsets adds substantial 
complexity to the analysis and the results become quite sensitive to subjective 
choices (how to split the data and how to weight each subset). The combination of 
UVES with all the HARPS pre-2016 (Fig. 1a) already produced an FAP of 1%, but a 
dedicated campaign was deemed necessary given the caveats with the sampling and 
activity related variability. The HARPS PRD campaign unambiguously identifies a 
signal with the same period of approximately 11.2 d. As discussed earlier, the combi- 
nation of all the data results in a very high significance (FAP < 10”), which implies 
that the period, but also the amplitude and phase are consistent in all three sets. 
Photometry signals and calculation of the FF’ index. The nightly average of the 
four photometric series was computed after removing the measurements clearly 
contaminated by flares (see Fig. 3). This produces 43 LCOGT epochs in the B and 
V bands (80 nights), and 66 ASH2 epochs in both S II and Hα bands (100 nights 
covered). The precision of each epoch was estimated using the internal dispersion 
within a given night. All four photometric series show evidence of a long-period 
signal that is compatible with a photometric cycle at 83 d (probably the rotation) 
reported before. See periodograms in Extended Data Fig. 3. 

In the presence of spots, it has been proposed that spurious variability should be 
linearly correlated with the value of the normalized flux of the star F, the derivative 
of the flux F’, and the product of FF’ (ref. 61) in what is sometimes called the FF’ 
model. To include the photometry in the analysis of the Doppler data, we used the 
best model fit of the highest-quality light curve (ASH2 S II, which has the lowest 
post-fit scatter) to estimate F, F’ and FF’ at the instant of each PRD observation. 
The relation of F, F’ and FF’ to the Doppler variability is investigated later in the 
Bayesian analysis of the correlations. 
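
The evaluation of F, F’ and FF’ at the spectroscopic epochs can be sketched as follows. Here `model_flux` is a hypothetical callable representing the best-fit light-curve model, and the derivative is taken by a central finite difference. The text does not state how the derivative was computed, so that choice is an assumption.

```python
import numpy as np

def ff_prime_indices(model_flux, epochs, step=1e-3):
    """Return F, F' and FF' of a light-curve model at the given epochs.

    `model_flux` is an assumed callable mapping times (days) to normalized
    flux; `step` is the finite-difference half-width in days.
    """
    epochs = np.asarray(epochs, dtype=float)
    f = model_flux(epochs)
    f_prime = (model_flux(epochs + step) - model_flux(epochs - step)) / (2.0 * step)
    return f, f_prime, f * f_prime
```

For a sinusoidal model with an 83 d period, for instance, F’ leads F by a quarter of a cycle, so the three indices probe different phases of the same photometric modulation.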

Width of the mean spectral line as measured by m2. The m2 measurement contains 
a strong variability that closely mirrors the measurements from the photometric 
time series (see Fig. 3). As in the photometry, the rotation period and its first 
harmonic (approximately 40 d) are clearly detected in the PRD campaign (see 
Extended Data Fig. 4). This apparently good match needs to be verified on other 
stars as it might become a strong diagnostic for stellar activity in M stars. The analy- 
sis of the HARPS pre-2016 data also shows very strongly that m2 is tracing the pho- 
tometric rotation period of 83 d. The modelling of this HARPS pre-2016 requires 
a second sinusoid with P_2 ≈ 85 d, which is peculiar given how close it is to P_1. We 
suspect this is caused by photospheric features on the surface changing over time. 
Asymmetry of the mean spectral lines as monitored by m3. The periodogram analysis 
of m3 of the PRD run suggests a signal at 24 d, which is close to twice the Doppler 
signal of the planet candidate (see Extended Data Fig. 5). However, line asym- 
metries are expected to be directly correlated with Doppler signals, not at twice 
nor integer multiples of the Doppler period. In addition, the peak has an FAP of 
5%, which makes it not significantly different from white noise. When looking at 
the HARPS pre-2016 data, strong beating is observed at 179 and 360 d, which is 
probably caused by a poorly sampled signal at that period or longer (possibly a 
magnetic cycle), or some residual systematic effect (possibly contamination by tel- 
lurics). In summary, m3 does not show evidence of any stable signal in the range of 
interest. 

Signal searches in the S-index. Although Hα and other lines such as the sodium 
doublet (NaD1 and NaD2) have been shown to be the best tracers for activity 
on M dwarfs, analysing the time series of the S-index is also useful because of its 
historical use in the long-term monitoring of main-sequence stars. In Extended 
Data Fig. 6 we show the likelihood ratio periodograms for the S-indices of the 
HARPS pre-2016 and PRD time series. As can be seen, no signals were found 
around the 11 d period of the radial velocity signal; however, two peaks were 
found close to the 1% FAP threshold with periods of approximately 170 and 340 d. 
To further test the reality of these possible signals, we performed a Lomb-Scargle 
periodogram analysis of the combined PRD and pre-2016 HARPS data. This 
test resulted in the marginal recovery of both the 170 and 340 d peaks seen in 
the likelihood periodograms, with no emerging peaks around the proposed 11 d 
Doppler signal. The Lomb-Scargle tests revealed some weak evidence for signals 
at much shorter periods, around 7 and 30 d. 

Given that there is evidence for substantial peaks close to periods of 1 yr, its 
first harmonic and the lunar period, we also analysed the window function of the 
time series to check if there was evidence that these peaks are artefacts from the 
combination of the window function pattern interfering with a real long-period 
activity signal in the data. The dominant power in the window function is found 
to increase at periods greater than 100 d, with a forest of strong peaks found in that 
domain, in comparison to that of sub-100 d periods, which is very flat, representing 
the noise floor of the time series. This indicates that there are likely to be strong 
interference patterns from the sampling in this region, and that the signal in the 
radial velocity data is also not due to the sampling of the data. A similar study in 
the context of the HARPS M dwarf programme was also done on Proxima. They 
compared several indices and finally decided to use the intensity of the chromo- 
spheric sodium doublet lines. They did not report any notable period at the time, 
but we suspect this was due to using fewer measurements and not removing the 
frequent flaring events from the series, which also requires the compilation of a 
number of observations to reliably identify the outliers caused by flares. 

Signal searches in Hα emission. Our likelihood-ratio periodograms for Hα 
(Extended Data Fig. 7) only show non-significant peaks in the 30-40 d period 
range. It is important to note that the analyses described above have been per- 
formed on multiple versions of the data set, in the sense that we analysed the 
full data set without removing measurements affected by flaring, then proceeded 
to reanalyse the activities by dropping data clearly following the flaring periods 
that Proxima went through when we observed the star. This allowed us to better 
understand the impact that flares and outliers have on signal interference in the 
activity indices. Although the distribution of the peaks in periodograms changes 
somewhat depending on how stringent the cuts are, no emerging peaks were seen 
close to an 11 d period. Concerning UVES Hα measurements, our likelihood-ratio 
periodogram did not detect any statistically significant signal. 

Further tests on the signal. It has been shown that at least some of the ultraprecise 
photometric time series measured by the CoRoT and Kepler space missions do not 
have a necessary property to be represented by a Fourier expansion: the underlying 
function, from which the observations are a sample, must be analytic. An 
algorithm introduced in ref. 64 can test this property and was applied to the PRD 
data. The result is that, contrary to the aforementioned light curves, the claim that the 
underlying function is non-analytic does not hold with the information available. 
Although the null hypothesis cannot be definitively rejected, at least until more 
data are gathered, our results are consistent with the hypothesis that a harmonic 
component is present in the Doppler time series. 

Flares and radial velocities. Among the high-cadence data from May 2013 with 
HARPS, two strong flares are fully recorded. During these events, all of the chro- 
mospheric lines become prominent in emission, Hα being the one that best traces 
the characteristic time dependence of flares observed on other stars and the Sun. 
The spectrum and impact of flares on the radial velocities will be described else- 
where in detail. Relevant to this study, we show that the typical flares on Proxima 
do not produce correlated Doppler shifts (Extended Data Fig. 8). This justifies 
the removal of obvious flaring events when investigating signals and correlations 
in the activity indices. 

Complete model and Bayesian analysis of the activity coefficients. A global 
analysis including all of the radial velocities and indices was performed to verify 
that the inclusion of correlations would reduce the model probability below the 
detection thresholds. Equivalently, the Doppler semi-amplitude would become 
consistent with zero if the Doppler signal was to be described by a linear correlation 
term. Extended Data Fig. 9 shows the marginalized distributions of linear corre- 
lation coefficients with the Doppler semi-amplitude K. Each subset is treated as a 
separate instrument and has its own zero point, jitter and MA term (coefficient) and 
its own activity coefficients. In the final model, the timescales of the MA terms are 
fixed to around 10 d because they were not constrained within the prior bounds, 
thus compromising the convergence of the chains. The sets under consideration are: 
(1) UVES. 70 radial velocity measurements and corresponding Hα emission 
measurements. 

(2) HARPS pre-2016. 90 radial velocity measurements obtained between 2002 
and 2014 by several programmes and corresponding spectroscopic indices: m2, 
m3, S-index and the intensities of the Hα and He I lines as measured on each 
spectrum. 

(3) HARPS PRD. 54 Doppler measurements obtained between 18 January and 
31 March 2016, and the same spectroscopic indices as for the HARPS pre-2016. 
The values of the F, F′ and FF′ indices were obtained by evaluating the best-fit 
model to the ASH2 S II photometric series at the HARPS epochs (see Methods 
subsection ‘Photometry signals and calculation of the FF′ index’). 

An activity index is correlated with the radial velocity measurements in a given 
set if the zero value of its activity coefficient is excluded from the 99% credibility 
interval. Extended Data Fig. 9 shows the equiprobability contours that contain 
50%, 95%, and 99% of the probability density around the mean estimate, and the 
corresponding 1σ uncertainties in red. Only the F′ index (the time derivative of 
the photometric variability) is substantially different from 0 at high confidence 
(Extended Data Fig. 9m). Linking this correlation to a physical process requires 
further investigation. To ensure that such correlations are causally related, one 
needs a model of the process causing the signal in both the radial velocity and 
the index, and in the case of the photometry one would need to simultaneously 
cover more stellar photometric periods to verify that the relation holds over time. 
Extended Data Table 1 contains a summary of all of the free parameters in the 
model, including the activity coefficients for each data set. 
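
The correlation criterion described above (zero excluded from the 99% credibility interval of an activity coefficient) can be illustrated with a short sketch. It assumes that posterior samples of a coefficient are available as an array and that an equal-tailed interval is an acceptable stand-in for the credibility interval; the text does not state which interval type was used. The function name and the synthetic samples are hypothetical and are not taken from the analysis itself.

```python
import numpy as np

def zero_excluded(samples, level=0.99):
    """Return (lo, hi, excluded) for an equal-tailed credibility interval.

    `excluded` is True when zero lies outside [lo, hi], which is the
    criterion used here to call an activity index correlated.
    """
    tail = 100.0 * (1.0 - level) / 2.0
    lo, hi = np.percentile(samples, [tail, 100.0 - tail])
    return lo, hi, not (lo <= 0.0 <= hi)

# Hypothetical posterior samples, for illustration only.
rng = np.random.default_rng(0)
clearly_nonzero = rng.normal(-1.0, 0.2, size=20000)
compatible_with_zero = rng.normal(0.1, 1.0, size=20000)

for name, samples in [("clearly_nonzero", clearly_nonzero),
                      ("compatible_with_zero", compatible_with_zero)]:
    lo, hi, excluded = zero_excluded(samples)
    print(f"{name}: 99% interval [{lo:.3f}, {hi:.3f}], zero excluded: {excluded}")
```

The same function applied with `level=0.68` would give an interval comparable in spirit to the 68% credibility intervals quoted for the model parameters.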

Extended Data Figure 1 | Window function. a–c, Window function of the UVES (a), HARPS pre-2016 (b) and HARPS PRD (c) data sets. The same 
window function applies to the time series of Doppler and activity data. Peaks in the window function are periods at which aliases of infinite period 
signals would be expected. The green vertical lines mark the period of the planet candidate at 11.2 d. 


© 2016 Macmillan Publishers Limited, part of Springer Nature. All rights reserved 


LETTER 


[Extended Data Figure 2 panels: RV - UVES; RV - HARPS pre-2016; RV - HARPS PRD. Horizontal axis: Period [days]; vertical axis: Δ ln L. Labelled periods: 11.2 days, ~11.19 days, 215 days.] 

Extended Data Figure 2 | Signal searches on independent radial velocity data sets. a–c, Likelihood-ratio periodogram searches on the radial 
velocity (RV) measurements of the UVES (a), HARPS pre-2016 (b) and HARPS PRD (c) subsets. The periodogram with all three sets combined is 
shown in Fig. 1. The black and red lines represent the searches for the first and second signals, respectively. The green vertical lines mark the period 
of the planet candidate at 11.2 d. 

Extended Data Figure 3 | Signal searches on the photometry. a–d, Likelihood-ratio periodogram searches for signals in each 
ASH2 photometric band (a, b) and LCOGT band (c, d). The two sinusoid fits to the ASH2 S II series (P1 = 84 d, P2 = 39.1 d) are used 
later to construct the FF′ model to test for correlations of the photometry with the radial velocity data. The black, red and blue lines represent the 
search for the first, second and third signal, respectively. The green vertical lines mark the period of the planet candidate at 11.2 d. 

Extended Data Figure 4 | Signal searches on the width of the spectral lines. a, b, Likelihood-ratio periodogram searches on the width of the 
mean spectral line as measured by m2 for the HARPS pre-2016 (a) and HARPS PRD data (b). The signals in the HARPS pre-2016 data are 
comparable to the photometric period reported in the literature and the variability in the HARPS PRD run compares quite well to the photometric 
variability. The black, red and blue lines represent the search for the first, second and third signal, respectively. The green vertical lines mark the 
period of the planet candidate at 11.2 d. 

Extended Data Figure 5 | Signal searches on the asymmetry of the spectral lines. a, b, Likelihood-ratio periodogram searches on the line 
asymmetry as measured by m3 from the HARPS pre-2016 (a) and HARPS PRD (b) data sets. Signal beating at around 1 yr and 0.5 yr is detected 
in the HARPS pre-2016 data, which is possibly related to instrumental systematic effects or telluric contamination. No signals are detected above 
the 1% threshold in the HARPS PRD campaign. The black and red lines represent the search for the first and second signals, respectively. The green 
vertical lines mark the period of the planet candidate at 11.2 d. 

Extended Data Figure 6 | Signal searches on the chromospheric S-index. a, b, Likelihood-ratio periodogram of the S-index from the HARPS pre-2016 
(a) and HARPS PRD (b) campaigns. No signals were detected above the 1% threshold. The green vertical lines mark the period of the planet candidate 
at 11.2 d. 




[Extended Data Figure 7 panels: Hα index - UVES and two further panels. Horizontal axis: Period [days]. Labelled period: ~23.8 days.] 


Extended Data Figure 7 | Signal searches on the spectroscopic Hα index. a–c, Likelihood-ratio periodogram searches of Hα intensity from the UVES 
(a), HARPS pre-2016 (b) and HARPS PRD (c) campaigns. No signals were detected above the 1% threshold. The green vertical lines mark the period of 
the planet candidate at 11.2 d. 

Extended Data Figure 8 | Radial velocities and chromospheric emission during a flare. a–d, Radial velocities (a) and equivalent width 
measurements of the Hα (b), Na doublet lines (c) and the S-index (d) as a function of time during a flare that occurred the night of 5 May 2013. The 
time axis is days since JD = 245417.0 d. No trace of the flare is observed in the radial velocities. Error bars in the radial velocities correspond to 
1σ errors. The formal 1σ errors in the equivalent width measurements are comparable to the size of the points. 

signal for UVES (a), HARPS pre-2016 (b–f), HARPS PRD campaign (g–k) and the photometric FF′ indices for the PRD campaign only (l–n). Each 
bar shows the zero value of each activity coefficient. Only C_F′ is found to be substantially different from zero. 

Extended Data Table 1 | Complete set of model parameters 

Parameter                  Mean [68% c.i.]             Units 
Period                     11.186 [11.184, 11.187]     days 
Doppler amplitude          1.38 [1.17, 1.59]           m s⁻¹ 
Eccentricity               <0.35                       – 
Mean longitude             110 [102, 118]              deg 
Argument of periastron     310 []                      deg 
Secular acceleration       0.086 [-0.223, 0.395]       m s⁻¹ yr⁻¹ 

Noise parameters 
σ_HARPS                    1.76 [1.22, 2.36]           m s⁻¹ 
σ_PRD                      1.14 [0.57, 1.84]           m s⁻¹ 
σ_UVES                     1.69 [1.22, 2.33]           m s⁻¹ 
φ_HARPS                    0.93 [0.46, 1]              m s⁻¹ 
φ_PRD                      0.51 [-0.63, 1]             m s⁻¹ 
φ_UVES                     0.87 [-0.02, 1]             m s⁻¹ 

Activity coefficients* 
UVES 
C_Hα                       -0.24 [-1.02, 0.54] 
HARPS pre-2016 
C_Hα                       -0.63 [-4.13, 3.25] 
C_He                       1.0 [-9.3, 11.4] 
C_S                        -0.027 [-0.551, 0.558] 
C_m2                       -1.93 [-6.74, 2.87] 
C_m3                       0.82 [-0.60, 2.58] 
HARPS PRD 
C_Hα                       9.6 [-12.9, 33.3] 
C_He                       -77 [-210, 69] 
C_S                        -0.117 [-0.785, 0.620] 
C_m2                       -2.21 [-8.86, 7.96] 
C_m3                       -0.02 [-3.67, 3.44] 
PRD photometry 
C_F                        0.0050 [-0.0183, 0.0284] 
C_F′                       -0.633 [-0.962, -0.304] 
C_FF′                      4.3 [-6.8, 14.8] 

The definition of all of the parameters is given in Methods subsection ‘Statistical frameworks and 
tools’. The values are the maximum a posteriori estimates and the uncertainties are expressed 
as 68% credibility intervals. The reference epoch for this solution is Julian Date 
t0 = 2,451,634.73146 d, which corresponds to the first UVES epoch. 
*The units of the activity coefficients are metres per second divided by the units of each activity index. 

Extending the lifetime of a quantum bit with error 
correction in superconducting circuits 

Quantum error correction (QEC) can overcome the errors 
experienced by qubits and is therefore an essential component 
of a future quantum computer. To implement QEC, a qubit is 
redundantly encoded in a higher-dimensional space using quantum 
states with carefully tailored symmetry properties. Projective 
measurements of these parity-type observables provide error 
syndrome information, with which errors can be corrected via 
simple operations. The ‘break-even’ point of QEC, at which the 
lifetime of a qubit exceeds the lifetime of the constituents of the 
system, has so far remained out of reach. Although previous works 
have demonstrated elements of QEC, they primarily illustrate 
the signatures or scaling properties of QEC codes rather than 
test the capacity of the system to preserve a qubit over time. Here 
we demonstrate a QEC system that reaches the break-even point 
by suppressing the natural errors due to energy loss for a qubit 
logically encoded in superpositions of Schrödinger-cat states of a 
superconducting resonator. We implement a full QEC protocol 
by using real-time feedback to encode, monitor naturally occurring 
errors, decode and correct. As measured by full process tomography, 
without any post-selection, the corrected qubit lifetime is 320 
microseconds, which is longer than the lifetime of any of the parts 
of the system: 20 times longer than the lifetime of the transmon, 
about 2.2 times longer than the lifetime of an uncorrected logical 
encoding and about 1.1 times longer than the lifetime of the best physical 
qubit (the |0⟩_f and |1⟩_f Fock states of the resonator). Our results 
illustrate the benefit of using hardware-efficient qubit encodings 
rather than traditional QEC schemes. Furthermore, they advance 
the field of experimental error correction from confirming basic 
concepts to exploring the metrics that drive system performance 
and the challenges in realizing a fault-tolerant system. 

Implementing QEC in the laboratory is challenging, requiring a 
complex system with many qubits. Even for a perfectly realized QEC 
system of finite size, there will always be unrecoverable errors or failure 
modes, resulting in an exponential decay of the information over time. 
In fact, error correction first introduces a hardware overhead penalty, 
because an uncorrected logical qubit consisting of n physical qubits 
(for typical first-order codes n= 5-10; ref. 22) will experience 
decoherence that is of order n times faster. A central goal of QEC 
is to suppress the naturally occurring errors and surpass the break- 
even point, at which the lifetime gain due to error correction is larger 
than this overhead penalty. These considerations motivate exploring 
a hardware-efficient approach to QEC, with which it may be more 
tractable to not only overcome the entire overhead, but to pinpoint the 
leading limitations to fault-tolerance. 

The encoding of logical states as superpositions of Schrödinger-cat 
states (hereafter, ‘cat code’) that we implement here is a hardware- 
efficient scheme that requires fewer physical resources and introduces 
fewer error mechanisms than do traditional QEC proposals. Designed 


to operate within a continuous-variable framework, the cat code 
exploits the fact that a coherent state |α⟩ is an eigenstate of the resonator 
lowering operator â: â|α⟩ = α|α⟩. Using a logical basis composed of 
superpositions of cat states, which are eigenstates of photon-number 
parity, the cat code requires just a single ancilla to monitor the dominant 
error due to single photon loss induced by resonator energy damping. 
This error channel gives rise to two effects: deterministic energy decay 
of the resonator field to vacuum and the stochastic application of â, 
which results in a change of photon-number parity of any state within 
the cat code. The former becomes a limiting factor only at small resonator 
field amplitudes when coherent state overlap can no longer be 
neglected and can be addressed through either dissipative pumping 
approaches or unitary gates. The latter, photon loss, is accompanied 
by phase shifts of π/2 about the Z_c axis within the logical space, indicating 
that by monitoring photon parity as the error syndrome we 
adhere to the prescriptions for error correction by translating single 
photon loss into a unitary operation on the encoded state (Fig. 1): 


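$$\hat{a}\left(c_0|C_\alpha^{+}\rangle + c_1|C_{i\alpha}^{+}\rangle\right) \propto c_0\left(|\alpha\rangle - |{-\alpha}\rangle\right) + i\,c_1\left(|i\alpha\rangle - |{-i\alpha}\rangle\right)$$

$$\propto c_0|C_\alpha^{-}\rangle + i\,c_1|C_{i\alpha}^{-}\rangle$$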


where c_0 and c_1 are arbitrary coefficients satisfying |c_0|² + |c_1|² = 1 and 
|C_{(i)α}^±⟩ = (|(i)α⟩ ± |−(i)α⟩)/√2 (the normalization factor √2 holds 
in the limit of large α (refs 17 and 18)). By detecting photon jumps in 
real-time with quantum non-demolition parity measurements, we 
learn how the phase relationship between the basis states changes, 
thereby protecting the encoded qubit from the dominant error channel 
of the system. The rate of photon jumps scales linearly with the average 
photon number n̄ (ref. 17), which exactly mirrors the aforementioned 
overhead faced by traditional QEC codes. Thus, when 
implementing the cat code, a central figure of merit when assessing the 
performance of the QEC system will be the degree to which we can 
overcome the encoding overhead with the application of fast, repeated 
parity measurements in time. 
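
The action of photon loss on the cat code can be checked numerically. The sketch below is an illustration only: it assumes a Fock space truncated at 40 levels, an amplitude α = √2 and arbitrarily chosen coefficients, none of which come from the experiment. It builds the even-parity encoded state, applies the lowering operator once, and compares the result with the odd-parity state in which the second coefficient has acquired a factor of i.

```python
import numpy as np

def coherent(alpha, dim=40):
    """Fock-basis amplitudes of a coherent state, truncated at `dim` levels."""
    amps = np.zeros(dim, dtype=complex)
    amps[0] = np.exp(-abs(alpha) ** 2 / 2)
    for n in range(1, dim):
        amps[n] = amps[n - 1] * alpha / np.sqrt(n)
    return amps

def cat(alpha, sign, dim=40):
    """Unnormalized two-component cat state |alpha> + sign * |-alpha>."""
    return coherent(alpha, dim) + sign * coherent(-alpha, dim)

dim = 40                      # truncation (assumed large enough for alpha below)
alpha = np.sqrt(2.0)          # illustrative amplitude, mean photon number ~2
c0, c1 = 0.6, 0.8j            # arbitrary coefficients with |c0|^2 + |c1|^2 = 1

# Lowering operator: a|n> = sqrt(n)|n-1>
a = np.diag(np.sqrt(np.arange(1, dim)), k=1)
parity = np.diag((-1.0) ** np.arange(dim))

def mean_parity(psi):
    """Expectation value of photon-number parity for an unnormalized state."""
    return np.real(np.vdot(psi, parity @ psi) / np.vdot(psi, psi))

even_state = c0 * cat(alpha, +1) + c1 * cat(1j * alpha, +1)
after_jump = a @ even_state
expected = c0 * cat(alpha, -1) + 1j * c1 * cat(1j * alpha, -1)

print("parity before jump:", round(mean_parity(even_state), 6))
print("parity after jump: ", round(mean_parity(after_jump), 6))
print("matches c0|C-> + i c1|C_i->:", np.allclose(after_jump, alpha * expected, atol=1e-10))
```

The proportionality constant is α itself, which is why the comparison is made against `alpha * expected`; the parity flip is what the ancilla detects as the error syndrome.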

We use a 3D circuit quantum electrodynamics (QED) architecture 
consisting of a single transmon qubit coupled to two waveguide 
resonators. The transmon is used as an ancilla both to provide 
the error syndrome and to encode and decode the logical states 
(Supplementary Information, section 3). One resonator stores the 
logical states while the other is used for ancilla readout and control. 
The dominant storage—ancilla interaction terms are described by the 
following Hamiltonian: 


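$$\hat{H}/\hbar = \omega_s\,\hat{a}^{\dagger}\hat{a} + \left(\omega_a - \chi_{sa}\,\hat{a}^{\dagger}\hat{a}\right)|e\rangle\langle e| - \frac{K}{2}\,\hat{a}^{\dagger 2}\hat{a}^{2}$$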


with |e⟩⟨e| the ancilla excited state projector, ω_s and ω_a the storage 
resonator (henceforth the resonator) and ancilla transition frequencies, 

Figure 1 | The cat-code cycle. In the logical encoding of 
|0⟩ = |C_α^+⟩ = |α⟩ + |−α⟩ and |1⟩ = |C_{iα}^+⟩ = |iα⟩ + |−iα⟩ (normalizations 
omitted), the two ‘2-cats’ |C_α^±⟩ and |C_{iα}^±⟩ are eigenstates of either even (+) 
or odd (−) photon-number parity (an ‘n-cat’ is a superposition of 
n coherent states). For large enough |α| they are effectively orthogonal to 
one another. In this basis, the states along the logical axes +X_c and +Y_c are 
both ‘4-cats’ of even or odd parity as well. The different patterns in the 
fringes of their cartoon Wigner functions signify the different phase 
relationship between the basis states. These features allow one to store a 
qubit in a superposition of 2-cats, |ψ⟩ = c_0|C_α^±⟩ + c_1|C_{iα}^±⟩, and at the same 
time monitor the parity as the error syndrome without learning anything 
about c_0 or c_1, the arbitrary coefficients satisfying |c_0|² + |c_1|² = 1. In this 
example, we choose to encode |0⟩ and |1⟩ in the even-parity basis, although 
the odd basis can equally be chosen. The loss of a single photon changes 
not just the parity of the basis states (red shading, even; blue shading, odd), 
but the phase relationship between them by a factor of i as well: 
â(c_0|C_α^+⟩ + c_1|C_{iα}^+⟩) = c_0|C_α^−⟩ + i c_1|C_{iα}^−⟩. Therefore, after one photon jump, 
one finds the initial qubit rotated by π/2 about the logical Z_c axis. With 
each subsequent application of â, the encoded state cycles between the 
even- and odd-parity subspaces, while, owing to each consequent 
multiplication of the coefficient c_1 by i, the encoded information rotates 
about the Z_c axis by π/2, as indicated by the rotation of the green shaded 
slice. Between the stochastic applications of â, the cat states 
deterministically decay towards vacuum: α → αe^{−κt/2} (not depicted here), 
indicating that the logical basis changes in time. 


respectively, χ_sa/(2π) ≈ 1.95 MHz the dispersive frequency shift, K/(2π) ≈ 
4.5 kHz the resonator anharmonicity, or Kerr, and ħ the reduced 
Planck constant. The ancilla has coherence times T_1 ≈ 35 μs and 
T_2 ≈ 13 μs; the resonator has a single-photon Fock state relaxation time 
T_1 ≈ 250 μs and coherence time T_2 ≈ 330 μs. To perform high-fidelity 
single-shot measurements of the ancilla, we set the readout resonator 
to have a 1-MHz bandwidth and use a nearly quantum-limited 
phase-preserving amplifier, the Josephson parametric converter 
(JPC), as the first stage of amplification, which allows for a readout 
fidelity of 99.3%. The error syndrome is measured using a Ramsey-style 
pulse sequence consisting of two π/2 pulses applied on the ancilla and 
separated in time by π/χ_sa ≈ 250 ns (ref. 30). The subsequent ancilla 
projective readout takes about 700 ns, which includes integration times, 
cable latencies and feedback delays. A change in the ancilla state after 
the Ramsey mapping indicates a change in parity, or the loss of one 
photon (assuming negligible photon excitation) with 98.5% fidelity 
(Supplementary Information, section 1). The total duration of each 
error syndrome measurement is just 1 μs, or approximately 0.8% of the 
average time between photon jumps for cat states with n̄ = 2. 
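
The timing figures in this paragraph follow from the quoted parameters by simple arithmetic, sketched below. The sketch assumes that the mean interval between photon jumps is the single-photon relaxation time divided by the mean photon number, consistent with the linear scaling of the jump rate stated earlier; the variable names are illustrative.

```python
import math

chi_sa = 2 * math.pi * 1.95e6   # dispersive shift in rad/s (chi_sa / 2pi = 1.95 MHz)
t1_resonator = 250e-6           # single-photon Fock state relaxation time, s
n_bar = 2                       # mean photon number of the cat states
t_syndrome = 1e-6               # total duration of one syndrome measurement, s

# Ramsey delay between the two pi/2 pulses of the parity mapping
ramsey_delay = math.pi / chi_sa

# Assumed jump rate n_bar / T1, so the mean interval is T1 / n_bar
mean_jump_interval = t1_resonator / n_bar
fraction = t_syndrome / mean_jump_interval

print(f"Ramsey delay: {ramsey_delay * 1e9:.0f} ns")
print(f"mean interval between jumps: {mean_jump_interval * 1e6:.0f} us")
print(f"syndrome measurement / jump interval: {fraction:.1%}")
```

The Ramsey delay comes out near 256 ns, which rounds to the quoted 250 ns, and the measurement occupies 0.8% of the 125 μs mean interval between jumps.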

We use a new real-time controller designed to execute programs for 
quantum control (Supplementary Information, section 5). Every 
repetition of the program (Fig. 2a-d) begins with the controller 
encoding one of the six cardinal points on a Bloch sphere in the even 
logical basis states, enough to perform process tomography of the full 
QEC system. The number of syndrome measurements and the waiting 
time t_w between them are set to optimal values to balance the risk of 
missing photon jumps and the possibility of measurement back-action 
on the resonator state due to ancilla T_1 (Supplementary Information, 
section 4; more details below); for cat states of n̄_0 = 3, t_w ≈ 13 μs. The 
program uses a state machine for adaptive parity monitoring, in which 
the sign of the second π/2 pulse in the parity mapping is chosen in 
real-time to maximize the probability of measuring the ancilla in its 
ground state |g⟩; this improves measurement reliability by decreasing 
the probability that the state of the ancilla will change during 
measurement. 

The program stores in memory a record of ‘0’s (no error) and 
‘1’s (error) that specifies the monitoring history; Fig. 2b shows the 
four possibilities for two steps: {00, 01, 10, 11} with probabilities 
{70.4%, 13.7%, 11.8%, 4.1%}. The asymmetry in the occurrence of 01 
and 10 is due to parity measurement infidelity and is well-modelled by a 
Bayesian analysis (Supplementary Information, section 7). Conditioned 
on obtaining one of these four records, Wigner tomography provides 
a striking visual demonstration of the cat code in action. Interference 
fringes, signatures of quantum coherence, continue to be sharp and 
extremal as the program proceeds in time, as compared with the case 
of performing no parity monitoring. At each point in the program the 
tomograms agree well in parity contrast, phase and amplitude as seen 
in simulations. These levels of predictability highlight the advantages of 
this hardware-efficient scheme: knowing the Hamiltonian parameters 
and the measurement fidelity of a single error syndrome is sufficient to 
encapsulate the evolution of an error-corrected logical qubit. 

Two further applications of feedback are necessary to maximize the 
performance of the QEC system. Owing to the non-commutativity of 
the Kerr Hamiltonian (K/2)â†²â² and â, a photon jump results in a phase 
shift of the resonator state in phase space proportional to K and the 
jump time t_j: θ_K = K t_j (ref. 18). The controller must therefore use the 
monitoring history, which provides a best estimate of t_j, to consolidate 
trajectories of equal error number, yet different error timestamp, into 
a single effective resonator state in real-time; for example, before 
decoding, 01 and 10 become a single ‘1 error’ state. The controller also 
decides in real-time to apply a different set of decoding pulses on the 
basis of the final parity. Figure 2c shows qubit state tomography of 
the ancilla after decoding, but before correction, conditioned on the 
number of errors. The rotation of the six cardinal points by π/2 for one 
error and π for two errors indicates that the cat code successfully maps 
photon-loss errors in the resonator onto a unitary operation on the 
encoded qubit. Upon completion of execution of the program, the 
knowledge of how many errors occurred is equivalent to having 
corrected the state. Although aligning the Bloch spheres of all error 
trajectories to the same orientation requires a simple phase adjustment 
on the ancilla drive in the decoding sequence, in this example we 
instead choose to show the performance of each error case individually. 
The program thus returns the corrected qubit, now stored again in the 
ancilla, completing the full QEC cycle. 

We benchmark the performance of our QEC system by performing 
process tomography of the QEC system. We use the chi matrix 
representation for a single qubit, in which state tomography of the 
output density matrix ρ_fin is used to calculate the measured, complex 
4 × 4 chi matrix χ^M. The fidelity F = tr(χ^M χ_0) is defined as the overlap 
of χ^M with χ_0, the chi matrix for the identity operator Î, the ideal case 
in which a QEC system corrects a state perfectly. Shown in Fig. 2e are 
the process matrices for the QEC program demonstrated in Fig. 2a–d. 
The form of χ^M (χ_j^M for j = 0, 1 and 2 errors) matches the process 
matrix for ideal rotations about the Z axis (defined by the Pauli matrix 
σ_z) by jπ/2, χ_{jπ/2}. Signatures of developing incoherent mixture are 
evident from the non-zero values in all diagonal elements. 
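
The overlap fidelity defined above can be made concrete for the ideal case. The sketch below assumes the standard chi-matrix convention in the Pauli basis (I, X, Y, Z) and a noiseless rotation about Z; it does not use the measured data, and the function names are illustrative. It shows that an uncorrected rotation by jπ/2 has overlap cos²(jπ/4) with the identity process, while its overlap with the matching ideal rotation is 1.

```python
import numpy as np

def chi_z_rotation(theta):
    """Chi matrix, in the Pauli basis (I, X, Y, Z), of an ideal rotation about Z.

    Uses R_z(theta) = cos(theta/2) I - i sin(theta/2) Z, so the chi matrix
    is the outer product of the expansion coefficients with their conjugates.
    """
    coeffs = np.array([np.cos(theta / 2), 0, 0, -1j * np.sin(theta / 2)])
    return np.outer(coeffs, coeffs.conj())

def process_fidelity(chi_m, chi_ref):
    """Overlap F = tr(chi_m chi_ref) between two chi matrices."""
    return float(np.real(np.trace(chi_m @ chi_ref)))

chi_identity = chi_z_rotation(0.0)

for j in range(3):
    chi_j = chi_z_rotation(j * np.pi / 2)
    f_identity = process_fidelity(chi_j, chi_identity)
    f_rotation = process_fidelity(chi_j, chi_j)
    print(f"j = {j}: overlap with identity {f_identity:.2f}, "
          f"with ideal rotation {f_rotation:.2f}")
```

This is why the process matrices before correction are compared with the ideal rotations rather than with the identity: a perfectly tracked single error would otherwise appear as a fidelity of only 0.5.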

Moving to an initial encoding size of n̄_0 = 2 to reduce the probability 
of photon jumps, we implement the cat code with up to six syndrome 
measurements over approximately 110 μs (Fig. 3a). As in Fig. 2b, each 
repetition is separated by an optimized waiting time that depends on 

Figure 2 | Example of a two-step quantum trajectory executed by the 
QEC state machine. a, Six cardinal points on the Bloch sphere ρ_init are 
encoded (‘E’) from the ancilla onto even-parity resonator states; green 
markers indicate the initial coordinate-system orientation. A ‘Wigner 
tomography snapshot’ is shown for (1/√2)(|0⟩ − |1⟩) → (1/√2)(|C_α^+⟩ − |C_{iα}^+⟩), 
n̄_0 = 3; β is the amplitude of varying coherent displacements (D_β|0⟩_f = |β⟩) 
and the average parity is ⟨P⟩ = ⟨D_β P D_β†⟩, where P = e^{iπâ†â}. b, A state machine 
for adaptive parity monitoring with delays t_w ≈ 13 μs between each 
measurement. ‘Parity measurement’ rectangles show the Ramsey sequence 
that maps even [+/−] (odd [+/+]) parity onto ancilla |g⟩ (|e⟩); the + or − 
specifies the sign of the π/2 pulse. Diamonds indicate branching on ancilla 
measurement (0 → no error, |g⟩; 1 → error, |e⟩); the ‘π pulse on ancilla’ 
rectangle indicates ancilla reset (|e⟩ → |g⟩); clocks indicate recording of 
the error time t_j. Dashed purple arrows emphasize the phase difference 
between 10 and 01 due to θ_K. Rotations θ_χ are due to cross-Kerr 
interactions between the readout and storage resonators during ancilla 


the average photon number in the resonator, starting at t_w ≈ 15 μs and 
increasing after each subsequent step up to t_w ≈ 25 μs to account for the 
decay of the average photon number. Without post-selection, the cat 
code outperforms the uncorrected transmon with a time constant of 
exponential decay that is a factor of about 20 higher, indicating that 
although the coupling between the resonator and transmon is always 




measurements. The parity (Wigner tomogram origin) matches the best 
estimate (border colour); tomograms match the expected resonator state as 
seen in simulations. c, The feedback aligns all states by changing the phase 
of subsequent resonator drives (for example, for Wigner tomography or 
decoding pulses) to account for θ_K and θ_χ. Ancilla tomography after 
decoding shows the expected rotations of π/2 per error about Z (green 
markers). Different decoding pulses (‘D’) are chosen in real-time 
depending on the parity of the final state. d, The correction to obtain the 
final state ρ_fin is made via coordinate system rotations (R_z, where Z is the 
qubit axis defined by Pauli matrix σ_z) by 0 (for 0 errors), −π/2 (for 1 error) 
or −π (for 2 errors) in software. e, Process tomography results for j = 0, 1 
and 2 errors before correction. Ideal χ_{jπ/2} process matrices are shown in 
wire-outlined bars. Experimental data for χ^M are shown in solid bars; the 
values are complex numbers with the amplitude on the vertical axis and an 
argument specified by the bar colour. Amplitudes of less than 0.01 are not 
depicted. Process tomography after correction is shown to the right of the 
arrow. I, the identity matrix; F, fidelity. 


on, the efficient extraction of error syndromes using an ancilla with 
inferior coherence properties still allows for substantial gains in 
lifetime. Moreover, the cat code surpasses the decay of the uncorrected 
cat code by a factor of about 2.2, demonstrating that applying error 
correction to the logical encoding makes up for the faster error rates 
introduced by the hardware overhead. The palpable difference in 

Figure 3 | QEC process tomography. a, To implement QEC, we 
redundantly encode the qubit in cat states (n̄_0 = 2) and pay the required 
overhead penalty, which is ubiquitous to QEC. This leads initially to worse 
performance; the process fidelity (F(t)) of the uncorrected cat code 
(orange circles), where cat states are left to decay freely between encoding 
and decoding, exhibits faster decay as compared to the Fock states |0⟩_f and 
|1⟩_f (grey circles). Substantial improvements in performance are realized 
with the full QEC system; the corrected cat code (red triangles) surpasses 
the uncorrected transmon (green squares) by a factor of about 20, makes 
up for the QEC overhead by a factor of about 2.2, and offers an 
improvement over the Fock-state encoding by a factor of about 1.1. With 
only high-confidence trajectories (blue diamonds), the decay time τ 
increases to τ > 0.5 ms. The top axis indicates the number of syndrome 
measurements used for each point in the corrected cat code. Cat code data: 


contrast between the cardinal points of the uncorrected versus the 
corrected cat codes after approximately 110 μs (Fig. 3b) demonstrates 
that with a full QEC system we can enhance the lifetime of a qubit 
without giving preference to any one direction on the Bloch sphere. By 
decaying with a time constant that exceeds that of the Fock state by a 
factor of 1.1, this system reaches the break-even point of QEC. 

The history of errors also provides us with a valuable measure 
of confidence that the result of an error syndrome measurement 
faithfully reflects the actual error history. Indeed, a ‘low-confidence’ 
measurement record that suggests two or more consecutive errors (for 
example, 11 as in Fig. 2c) has a much lower probability of faithfully 
reflecting the true error trajectory of the resonator state than does a 
‘high-confidence’ record, wherein a 1 is ‘confirmed’ by a subsequent 0 
(Supplementary Information, section 7). If we accept only high-confidence 
trajectories, still keeping 80% of the data after 100 μs, then 
we obtain a decay constant of over half a millisecond. The marked 
improvement we observe when excluding ‘low-confidence’ trajectories 
points to parity measurement infidelity, primarily due to ancilla 
decoherence, as the dominant limitation on cat-code performance. 

An overall analysis of the budget for the lifetime gain for our QEC 
system is shown in Table 1, which lists the dominant avenues of code 
failure common to any QEC system and encapsulates the challenges 
one faces in realizing fault-tolerant QEC. Contributions from the first 
five entries in Table 1 can be suppressed by measuring more quickly 
and using a quantum filter to estimate the parity at any given time, as 
seen in the column where t_w ≈ 0 μs. However, errors due to the ancilla 
T_1 persist regardless of measurement rate. Owing to its dispersive 
coupling to the resonator, a change in the energy of the ancilla at an 
unknown time imparts an unknown rotation to the resonator state in 
phase space; this is the forward propagation of an error. Measuring the 
syndrome more frequently only increases the likelihood of ancilla- 
induced dephasing, necessitating the aforementioned optimized 

100,000 averages per point; transmon, Fock states: 50,000 averages per 
point; error bars are smaller than marker sizes. Although no data exhibits 
strictly single-exponential decay, all curves are well modelled by 
F(t) = 0.25 + A e^{−t/τ} (dotted lines), with τ the decay time of the specific 
qubit storage scheme and A a fitting constant that is ideally equal to 0.75. 
F = 0.25 (grey dashed line) implies a complete loss of information. 
Uncertainties are given by the errors (on τ) in the fit. Fluctuations in the 
uncorrected cat code are explained by the Kerr effect and are reproduced 
in simulation. b, State tomography after approximately 110 μs 
(corresponding to black arrows in a). Transmon and Fock-state Bloch 
spheres show amplitude damping. Bloch sphere shrinking for the cat code 
is well-characterized by a depolarization channel. The system substantially 
benefits from QEC, as seen from the greater definition of each cardinal 
point in the corrected versus uncorrected case. 
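
The decay model quoted in the caption, F(t) = 0.25 + A e^{−t/τ}, can be fitted in several ways. The sketch below shows one simple option, a straight-line fit to log(F − 0.25), applied to noise-free synthetic data. The fitting procedure actually used for the figure is not described here, so this approach, the function name and the sample points are assumptions for illustration. The log-linear form only works while F stays above 0.25.

```python
import numpy as np

def fit_decay(t, fidelity, offset=0.25):
    """Fit F(t) = offset + A * exp(-t / tau) via a linear fit to log(F - offset).

    Returns (tau, A). Requires fidelity > offset at every point.
    """
    slope, intercept = np.polyfit(t, np.log(fidelity - offset), 1)
    return -1.0 / slope, np.exp(intercept)

# Synthetic, noise-free data generated from the model itself (illustration only).
t_us = np.linspace(0.0, 110.0, 12)
tau_true, a_true = 318.0, 0.75
synthetic = 0.25 + a_true * np.exp(-t_us / tau_true)

tau_fit, a_fit = fit_decay(t_us, synthetic)
print(f"tau = {tau_fit:.1f} us, A = {a_fit:.3f}")
```

With real, noisy data a weighted nonlinear least-squares fit would usually be preferable, because the logarithm distorts the noise at small F − 0.25.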


Table 1 | Failure modes of the corrected logical qubit 


Failure mode Dominant source Maximum rate, Optimal rate, 
tw0 ps ty20 ps 
Predicted + 

Double errors Cavity 4-4 40 ms 1.7 ms 
Uncorrectable errors Cavity a! 6ms 6ms 
Readout error Transmon T, 7 ms 2ms 
Ancilla preparation Transmon I} 300 ms 900 us 
Undesired couplings Cavity 4°24? 600 ms 3ms 
Forward propagation Transmon 7; 200 is 600 us 
Net lifetime Predicted 200 is 320 us 

Measured = 318 js 
Gain over uncorrected logical qubit 14 22 
Gain over best physical qubit 0.7 1,1, 


This table shows the predicted decay time constant (τ) of quantum information stored in a
corrected logical qubit using the cat-code paradigm under a scenario in which each individual
failure mode is the only source of loss. Dominant modes of failure in the cat code are:
double errors (â followed by â) between consecutive syndrome measurements separated
by a time t_w; possible errors that the cat code does not address, such as additions of a single
photon (â†); a failed parity mapping resulting from ancilla dephasing (T_φ); incorrect ancilla
initialization before syndrome measurement resulting from unknown excitations (Γ↑) of the
ancilla during t_w; undesired couplings that result in dephasing due to Kerr (â†²â²); and ancilla
decoherence that directly propagates to unrecoverable errors in the resonator state, which
is a result of ancilla decay or excitation (T_1). Two different measurement strategies are shown
for an initial n̄_0 = 2: as quickly as possible (t_w ≈ 0 μs) and the optimal monitoring time (t_w ≈ 20 μs).
The lowest two rows show the multiplicative gains of cat-code performance over the decay
constants of the uncorrected logical qubit (147 μs) and the best physical qubit of the system
(287 μs, Fock states |0⟩_f, |1⟩_f). These gains reflect the combined effects of all loss channels acting
together. We predict all numbers using an analytical model derived in Supplementary
Information, section 6, and show that for the net gains the failure modes do not contribute
independently. Using the optimal measurement strategy, we find that the predicted gains in
lifetime over the uncorrected logical qubit and over the Fock state encoding match the
measured performance of the corrected cat code (318 μs) shown in Fig. 3. Lifetimes of at least
6 ms would be possible if the forward propagation of errors from the syndrome measurements
were abated.
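
The gain rows of Table 1 are ratios of decay constants, so they can be checked by simple division. The short sketch below does this using the lifetimes quoted in the table and its note; the variable and function names are illustrative and not taken from the paper.

```python
# Reference decay constants quoted in the table note (microseconds).
TAU_UNCORRECTED_US = 147.0    # uncorrected logical qubit (cat code)
TAU_BEST_PHYSICAL_US = 287.0  # best physical qubit (Fock-state encoding)

def gains(tau_us):
    """Return (gain over uncorrected logical qubit, gain over best physical qubit)."""
    return (round(tau_us / TAU_UNCORRECTED_US, 1),
            round(tau_us / TAU_BEST_PHYSICAL_US, 1))

# Net lifetimes from Table 1 (microseconds).
gain_max_rate = gains(200.0)   # predicted, t_w ~ 0 us
gain_measured = gains(318.0)   # measured, t_w ~ 20 us
gain_predicted = gains(320.0)  # predicted, t_w ~ 20 us
```

The measured 318 μs and the predicted 320 μs round to the same gains, which is consistent with the statement that the prediction matches the measured performance.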
measurement cadence that sets the delay between syndrome measurements
from t_w ≈ 0 μs to, on average, t_w ≈ 20 μs for n̄_0 = 2. We therefore
see that when designing a QEC system, sources of decoherence beyond 
double-errors per round of correction can motivate substantially slower 
measurement rates. However, because the cat code performs at the 
break-even point even in the presence of all of these sources of loss, we 
are optimistic about the prospect of realizing a fault-tolerant QEC 
system. Indeed, supplementing the cat code with a scheme that abates 
ancilla back-action promises to allow increased error syndrome 
measurement rates (a lower t_w) and thus greater gains in lifetime.

Our results show that QEC can actually protect an unknown bit 
of quantum information, and extend its lifetime by active means. 
Furthermore, we demonstrate the crucial role of real-time feedback 
with pulses that depend on the evolution of the quantum system, an 
addition to the experimental setup that greatly improves error correc- 
tion performance and allows us to realize the cat code at the break-even 
point of QEC. Future goals include combining the cat code with mech- 
anisms to re-inflate cat state amplitudes” and to equip the parity mon- 
itoring protocol to handle changes in ancilla energy, thereby addressing 
issues of non-fault-tolerance. With such capabilities we can then move 
beyond using the cat code as a quantum memory only and begin cou- 
pling multiple resonators together to demonstrate operations between 
error-corrected logical qubits’®. Our results motivate the adoption of 
hardware-efficient QEC schemes and demonstrate the promise of cat 
states as a basis for future quantum computing applications.

Molecular modifiers reveal a mechanism of
pathological crystal growth inhibition


Jihae Chung, Ignacio Granja, Michael G. Taylor, Giannis Mpourmpakis, John R. Asplin & Jeffrey D. Rimer


Crystalline materials are crucial to the function of living organisms, 
in the shells of molluscs!~’, the matrix of bone’, the teeth of sea 
urchins’, and the exoskeletons of coccoliths®. However, pathological 
biomineralization can be an undesirable crystallization process 
associated with human diseases’~*. The crystal growth of biogenic, 
natural and synthetic materials may be regulated by the action of 
modifiers, most commonly inhibitors, which range from small 
ions and molecules!!! to large macromolecules!”. Inhibitors 
adsorb on crystal surfaces and impede the addition of solute, 
thereby reducing the rate of growth'*'*. Complex inhibitor-crystal 
interactions in biomineralization are often not well elucidated’. 
Here we show that two molecular inhibitors of calcium oxalate 
monohydrate crystallization—citrate and hydroxycitrate—exhibit 
a mechanism that differs from classical theory in that inhibitor 
adsorption on crystal surfaces induces dissolution of the crystal 
under specific conditions rather than a reduced rate of crystal 
growth. This phenomenon occurs even in supersaturated solutions 
where inhibitor concentration is three orders of magnitude less 
than that of the solute. The results of bulk crystallization, in situ 
atomic force microscopy, and density functional theory studies are 
qualitatively consistent with a hypothesis that inhibitor-crystal 
interactions impart localized strain to the crystal lattice and that 
oxalate and calcium ions are released into solution to alleviate this 
strain. Calcium oxalate monohydrate is the principal component 
of human kidney stones!*!’ and citrate is an often-used therapy”, 
but hydroxycitrate is not. For hydroxycitrate to function as a 
kidney stone treatment, it must be excreted in urine. We report 
that hydroxycitrate ingested by non-stone-forming humans at an 
often-recommended dose leads to substantial urinary excretion. 
In vitro assays using human urine reveal that the molecular
modifier hydroxycitrate is as effective an inhibitor of
calcium oxalate monohydrate nucleation as is citrate. Our findings
support exploration of the clinical potential of hydroxycitrate as an 
alternative treatment to citrate for kidney stones. 

Figure 1a illustrates the habit of calcium oxalate monohydrate 
(COM) crystals with indexed faces. Here, we examine COM crystalli- 
zation in the presence of two inhibitors with nearly identical structure 
(Fig. 1b): citrate (CA) and hydroxycitrate (HCA). These two molecules 
differ only by a single alcohol group, yet this subtle difference markedly 
alters their specificity for binding to COM crystal surfaces. Scanning 
electron microscopy (SEM) images of COM crystals (Fig. 1c) reveal that 
CA alters growth in the c direction (Fig. 1d), which is consistent with 
results from prior atomic force microscopy (AFM) measurements”. 
Conversely, HCA binds to the apical tips (that is, the {121} and {021} 
surfaces) and generates diamond-shaped crystals (Fig. 1e).
Inhibitor-crystal interactions reduce the c/b aspect ratio (Fig. 1f) as well as the
[100] dimension (Fig. 1g) of COM crystals. Kinetic studies using an 
ion-selective electrode (ISE) to track the temporal depletion of free Ca²⁺
ions*’ in supersaturated calcium oxalate solution (Extended Data 
Fig. 1) indicate that HCA is a more potent growth inhibitor (Fig. 1h). 


Note that the third dissociation constant of CA (pK_a = 6.4) is close to the
pH of COM growth solution in this study (pH 6.2), thus indicating a
distribution of CA species with charge of either −3 or −2 (Extended Data
Fig. 2a). To ensure that we are comparing the effects of fully dissociated
HCA and CA, we performed ISE measurements of COM growth at pH
8.0 (Extended Data Fig. 2b), well above the pK_a of CA, and found that
HCA is the more effective inhibitor, irrespective of solution alkalinity
(see Methods for a detailed discussion). Increasing concentrations of both
CA and HCA lead to a maximum 60% inhibition of COM crystallization.
This net effect reflects the reduction in crystal growth rate as well as the
potential inhibition of COM nucleation. To assess the latter, we measure
the number of crystals collected on substrates per unit area, or the crystal
number density ρ_COM. As shown in Fig. 1i and Extended Data Fig. 3,
HCA and CA both inhibit COM nucleation, resulting in 62% ± 15% and
39% ± 17% reductions in ρ_COM, respectively.

Inhibitor interactions with hillocks presented on the surfaces of COM 
crystals reduce the rate of step advancement. In situ AFM measurements 
confirm that CA exhibits specificity for steps on the basal (100) surface 
of COM crystals that advance in the c direction. Time-resolved images 
also show that a CA concentration of >1 μg ml⁻¹ reduces interstep
distance, decreases step velocity, and generates protrusions on steps
(Fig. 1j). Continuous imaging of COM (100) hillocks reveals that
1 μg ml⁻¹ HCA reduces interstep distances near the origin of screw dislocations
(arrow in Fig. 1k) and slows the rate of step advancement.
In situ AFM measurements of the COM (010) surface indicate that HCA
inhibits the growth of hillocks bounded by {121} and {021} steps, leading
to the disappearance of distinct step edges within minutes of introducing
the inhibitor (Fig. 1l). Prior in situ AFM studies of COM have been
reported wherein macromolecular inhibitors, such as proteins”! and 
glycosaminoglycans”, have similar effects on surface growth. 

When evaluating the lower limits of CA and HCA efficacy by AFM, 
a unique mode of action compared to conventional mechanisms!*7+5 
of crystal growth inhibition was noted at inhibitor concentrations of
<0.25 μg ml⁻¹. Time-resolved images of COM (100) growth in the presence
of CA reveal the appearance of etch pits (Fig. 2a and Supplementary
Video 1) that increase in depth d with continuous imaging (Fig. 2b). 
A similar phenomenon was observed for HCA. Periodic snapshots 
from Supplementary Video 2 presented in Fig. 2c show a hillock 
on the (100) surface that is initially growing in the absence of inhibitor 
at supersaturation ratio S=4.1, and is then subjected to the same 
growth solution containing HCA with a molar ratio Ca²⁺/HCA ≈ 10³.
Etch pits immediately form once HCA is introduced into the AFM 
liquid cell. The etch pits appear to originate at step edges and evolve 
in both depth and width with imaging time (Extended Data Fig. 4). 

To our knowledge, this is the first observation of crystal dissolution
in a highly supersaturated growth environment. This effect cannot be
attributed to inhibitor complexation with free Ca²⁺ ions in solution,
which would require comparable concentrations of inhibitor and solute 
to reduce calcium concentration below the solubility of COM crystals. 
Here the inhibitor concentration is nearly three orders of magnitude
less than that of calcium, suggesting that the effect is related to specific
inhibitor-crystal interactions.

Figure 1 | Effect of inhibitors on COM crystallization. a, COM crystal
habit with indexed faces. b, Molecular structures of CA and HCA.
c–e, SEM images of COM crystals in the absence of inhibitor (control C)
(c) and in the presence of 20 μg ml⁻¹ CA (d) and in the presence of
20 μg ml⁻¹ HCA (e). Scale bars, 20 μm. f, Changes in COM [001]/[010]
(or c/b) aspect ratio with inhibitor concentration, C_inhibitor (n > 150).
The schematics (next to f and h) depict inhibitor specificity.
g, Comparison of COM [100] thickness at C_inhibitor = 20 μg ml⁻¹
(n > 150). h, Percentage inhibition of COM crystal growth as a
function of C_inhibitor (n > 8). i, Comparison of COM crystal number
density ρ_COM at C_inhibitor = 20 μg ml⁻¹ (n > 3 batches; see Extended
Data Fig. 3). Error bars in f and h are 2 standard deviations (s.d.); those
in g and i are 1 s.d. j–l, AFM deflection mode images of a (100) surface
(j, k) and a (010) surface (l) during in situ measurements in supersaturated
solution (S = 4.1). Images of control surfaces (top panels of j–l) reveal
two-dimensional layer growth. At C_CA = 2.1 μg ml⁻¹ (bottom panel of j)
we observe jagged step edges, rounding of hillocks, and reduced interstep
distances (arrows). Measurements at C_HCA = 1 μg ml⁻¹ (bottom panels of
k and l) indicate the appearance of step protrusions and decreased
interstep distances on the (100) face, whereas step edges become
indistinguishable on the (010) face. Scale bars, 1 μm.

Time-resolved in situ AFM images of COM (010) surface growth 
indicate that dissolution occurs within a narrow range of HCA 
concentration, C_HCA. Inhibitor-crystal interactions reduce step
velocity in the [121] and [021] directions at C_HCA < 0.08 μg ml⁻¹. The
velocity monotonically decreases with increasing C_HCA (Fig. 2d, e) in
a manner that suggests that HCA preferentially binds to step sites on
the COM (010) surface (Extended Data Fig. 5). Within the range
0.08 μg ml⁻¹ < C_HCA < 0.15 μg ml⁻¹, step velocities become negative
(Fig. 2e) and layers uniformly dissolve. At C_HCA > 0.15 μg ml⁻¹
we begin to observe the disappearance of distinct steps, as indicated in
Fig. 1l. Snapshots from Supplementary Video 3 (Fig. 2f) show the
explicit effect of HCA on [121] and [021] step advancement.
Immediately upon introducing a supersaturated solution (S = 4.1) with
C_HCA = 0.10 μg ml⁻¹ into the AFM liquid cell, steps recede towards the
centre of the hillock and etch pits (arrows in Fig. 2f) appear on terraces 
at later imaging time. As a means of comparison, we also assess COM 
(010) surface dissolution in an undersaturated solution (S = 0.5) 
without any inhibitor where time-elapsed images from Supplementary 
Video 4 reveal negative step velocity and layers uniformly receding 
(Fig. 2g) in a manner that is almost identical to the effect of HCA in 
supersaturated solution. 
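
The step velocities discussed above are slopes of step position versus imaging time, and the three concentration regimes can be written as a simple lookup. The sketch below shows both under stated assumptions: the least-squares slope stands in for the linear regression mentioned for Fig. 2d, the function names and sample numbers are hypothetical, and the treatment of the boundary values (exactly 0.08 and 0.15 μg ml⁻¹) is a choice made for the example, since the text does not specify it.

```python
def step_velocity(times_s, positions_nm):
    """Least-squares slope of step position versus time (nm per second)."""
    n = len(times_s)
    if n < 2 or n != len(positions_nm):
        raise ValueError("need two or more paired samples")
    mean_t = sum(times_s) / n
    mean_x = sum(positions_nm) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    sxy = sum((t - mean_t) * (x - mean_x)
              for t, x in zip(times_s, positions_nm))
    return sxy / sxx

def hca_regime(c_hca_ug_per_ml):
    """Qualitative behaviour of COM (010) steps reported for a given C_HCA."""
    if c_hca_ug_per_ml < 0:
        raise ValueError("concentration cannot be negative")
    if c_hca_ug_per_ml == 0:
        return "control (layer growth)"
    if c_hca_ug_per_ml < 0.08:
        return "reduced step velocity"
    if c_hca_ug_per_ml <= 0.15:   # boundary handling is an assumption
        return "dissolution (negative step velocity)"
    return "loss of distinct steps"

# Hypothetical receding step: position decreases by 0.5 nm every second.
v = step_velocity([0.0, 10.0, 20.0], [0.0, -5.0, -10.0])
```

A negative value of `v` corresponds to a step receding towards the centre of the hillock, which is the signature of dissolution described in the text.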

Few studies in literature postulate that inhibitors are capable of 
inducing crystal dissolution in supersaturated solution. Lutsko et al.7° 
simulated the effect of occluded impurities in crystals (illustrated in 
Fig. 3a) and showed that negative step velocity can be achieved when the 
impurity is sufficiently large and reaches a high surface coverage (that is, 
small separation distances Ax between occluded impurity molecules). 
An alternative hypothesis for localized (or virtual) dissolution along 
step edges'? describes the effect of modifiers on spiral growth originat- 
ing from screw dislocations. This mode of action leads to a reduced rate 
of spiral growth (illustrated in Fig. 3b). The reduced velocity of steps 
on COM (010) surfaces at low inhibitor concentration is qualitatively 
consistent with this theoretical mechanism; however, the trend deviates 
from pre-existing models at higher inhibitor concentration. 

For COM growth inhibition at this condition, we propose a new
mechanism to describe the effect of CA and HCA based on changes in
surface energy γ when the inhibitor adsorbs on COM crystal steps (Fig. 3c)
or terraces. For cases when inhibitor-crystal interactions are more
energetically favourable than solute-crystal interactions, the adsorbed
inhibitor imparts an interfacial strain on the crystal lattice. Strain fields
are illustrated in Fig. 3c and d with an arbitrary radius of curvature that
includes nearest-neighbour interactions between carboxylic acids of the
inhibitor and the carboxylic acid of surface oxalates (Ox) formed via a
calcium bridge, (inhibitor)COO⁻···Ca²⁺···⁻OOC(Ox, COM). We postulate
that steps dissolve and release Ox and calcium ions into solution to
alleviate this strain, thus generating fresh crystal interfaces for additional
inhibitor-crystal interactions to perpetuate dissolution, causing
steps to recede towards the centre of the hillock (that is, negative
step velocity). At higher inhibitor concentration, increased coverage
of inhibitor on COM crystal surfaces places the adsorbed molecules
in closer vicinity to each other on nearby step sites. This seemingly
slows the release of solute from crystal surfaces owing to mass transport
limitations, or steric hindrance, wherein inhibitor diffusion on,
or desorption from, the COM surface is rate-limiting. The competing
effects of inhibitor adsorption/desorption and solute attachment/
dissolution produce corrugated steps (illustrated in Fig. 3d), which is
consistent with AFM topographical images of COM (010) surfaces at
high HCA concentration (see Fig. 1l).

We used density functional theory (DFT) calculations to rationalize 
the observed experimental trends and shed light on the action of inhib- 
itors on COM crystal growth (see Methods for details). The calculated 
HCA binding strength on COM (100) and (021) surfaces (Fig. 3e, f) 
relative to that of an Ox molecule (Extended Data Fig. 6) indicates that 
HCA-crystal interactions are more energetically favourable. The net
difference between HCA and Ox binding to the COM (100) surface is
−31.0 kcal mol⁻¹ (BE_HCA = −87.3 and BE_Ox = −56.3 kcal mol⁻¹), and
the corresponding difference on the (021) surface is −94.7 kcal mol⁻¹
(BE_HCA = −170.2 and BE_Ox = −75.5 kcal mol⁻¹). The binding preference
of HCA for the (021) surface is qualitatively consistent with
experimental findings that HCA exhibits a specificity for the apical
tips of COM crystals, thus explaining its ability to reduce [021] step
velocity. On this basis, we expect that HCA adsorption induces
higher strain on the (021) face than the (100) face. Indeed, partial
relaxation of COM surfaces in the presence of adsorbed HCA
reveals that the (100) face is geometrically unaffected, whereas the
(021) face exhibits dislocations (Extended Data Fig. 7), consistent
with the proposed mechanism of crystal dissolution in Fig. 3b–d
where inhibitor-induced strain alters the γ of crystal faces in a
thermodynamically unfavourable manner. To quantify the strain of CA
and HCA binding to COM surfaces, we calculated the average
displacement δ of atoms in the crystal lattice. Our calculations reveal
that inhibitor adsorption on the (100) face has a marginal impact on
δ (see Extended Data Table 1); however, HCA binding to the (021) face
leads to higher strain (δ = 0.71 Å) compared to CA (δ = 0.55 Å), consistent
with HCA's specificity for inhibiting [021] growth. These findings
seemingly agree with general observations reported by Wojciechowski
and co-workers, who showed that ductile crystals strained under
constant tensile stress exhibit reduced rates of growth.

Figure 2 | Time-resolved imaging of COM crystal dissolution.
a, In situ AFM images show etch pit formation on the (100) surface at
C_CA = 0.10 μg ml⁻¹. b, Depth profile of the etch pit at t = 734 s
evaluated along the yellow dashed line (in right panel of a).
c, Snapshots from Supplementary Video 2 (t = 0–461 s) show etch pit
formation on a COM (100) surface at C_HCA = 0.25 μg ml⁻¹.
d, Measurements of step advancement in the [121] direction as a function
of imaging time at varying C_HCA. Solid lines are linear regression.
e, Step velocity in the [021] and [121] directions monotonically decreases
with increased HCA concentration at C_HCA < 0.08 μg ml⁻¹, whereas steps
recede (that is, negative velocity) at C_HCA = 0.08–0.15 μg ml⁻¹. Dashed
lines are interpolated. Error bars are 2 s.d. (n = 5). f, Time-resolved
AFM deflection mode images from Supplementary Video 3 (t = 0–807 s)
show a receding hillock on a (010) surface at C_HCA = 0.10 μg ml⁻¹.
g, AFM deflection mode images from Supplementary Video 4 (t = 0–923 s)
show a (010) hillock dissolving in undersaturated solution (S = 0.5)
in the absence of inhibitor. Scale bars, 1 μm.
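
The net binding-energy differences quoted in the main text follow directly from the individual DFT values, and the short check below reproduces them. The dictionary layout and names are illustrative only; the numbers are those reported in the text (kcal mol⁻¹).

```python
# DFT binding energies reported in the text (kcal/mol, negative = exothermic).
binding_energy = {
    "(100)": {"HCA": -87.3, "Ox": -56.3},
    "(021)": {"HCA": -170.2, "Ox": -75.5},
}

# Net preference of HCA over oxalate on each face: BE_HCA - BE_Ox.
net_difference = {
    face: round(be["HCA"] - be["Ox"], 1)
    for face, be in binding_energy.items()
}

# Face on which HCA is most strongly preferred (most negative difference).
preferred_face = min(net_difference, key=net_difference.get)
```

The more negative difference on the (021) face is what the text ties to HCA's specificity for the apical tips of COM crystals.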

Beyond COM surface interactions, we analysed the coordination
of organic anions with Ca²⁺ ions in simulations where the number
of molecules increases (Fig. 3g–j). Comparison of HCA, CA, and Ox
complexation with Ca²⁺ ions (Fig. 3g) reveals binding energies of
−174 kcal mol⁻¹, −144 kcal mol⁻¹, and −114 kcal mol⁻¹, respectively.
The corresponding number of Ca²⁺ ions complexed per organic anion
is 1.5, 1.5, and 1.0. These calculations indicate that HCA and CA display
a higher affinity for Ca²⁺ ion complexation relative to Ox. HCA–Ca²⁺
binding is the most energetically favourable, which is attributed to
increased hydrogen bonding in the complexes, owing to the presence
of an additional hydroxyl group on HCA compared to CA, in conjunction
with an observed molecular flexibility of HCA to fold around and
protect Ca²⁺ ions (Fig. 3j and Supplementary Video 5). On the basis of
these findings, we propose that an inhibitor must satisfy the criterion
BE_inhibitor–crystal > BE_solute–crystal (or alternatively
BE_inhibitor–calcium > BE_solute–calcium), compared by magnitude because
all binding energies are exothermic, in order to induce crystal dissolution.

Figure 3 | Mechanism and in silico study of inhibitor action.
a, Schematic of strain imposed by inhibitors that are incorporated
within an advancing, unfinished layer as a result of step pinning.
The crystal building unit refers to the solute, either calcium or oxalate.
b, Idealized mode of step growth inhibition by kink blocking at low
inhibitor concentration. c, d, Mechanism of strain-induced surface
dissolution at moderate and high inhibitor concentration, respectively.
e, f, Structural conformation of a single HCA molecule adsorbed on
idealized COM (100) and (021) faces (see Extended Data Fig. 7 for
results of partially relaxed COM crystal surfaces). Atoms are coloured
to represent hydrogen (white), carbon (grey), oxygen (red), and calcium
(green). g, DFT-calculated binding energy for the complexation of fully
dissociated HCA, CA and Ox molecules with calcium ions (see Extended
Data Fig. 8 for the calculation of CA with −2 charge). Binding energy
data are scaled by the number of molecules N. Dashed lines connecting
symbols are added to guide the eye. h–j, Structures of organic anion and
calcium ion complexes (shown for N = 4). All binding energy values are
exothermic and reported in kcal mol⁻¹.

The relevance of COM in pathological crystallization (kidney stone
disease) motivated a detailed study of HCA as a potential replacement
for potassium citrate, a supplement that is often prescribed to patients
with calcium oxalate kidney stone disease. CA is a normal component
of human urine, and is thought to prevent kidney stone formation by
complexing Ca²⁺ ions and by acting as an inhibitor of COM crystal
growth. Treatment with alkali (most commonly citrate salts) further
increases urine CA levels and has been shown to reduce the formation
of calcium stones; however, as many as 16% of patients in treatment
trials have discontinued the medication owing to side effects.
Moreover, the past 30 years have witnessed no major advancement in
kidney stone therapy, despite evidence that the incidence rate of kidney
stone disease is on the rise.

Our findings suggest that HCA has the potential to be an alternative
to potassium citrate. We performed an in vitro assay to assess the
effect of HCA and CA on the upper limit of metastability in urine
samples from eight patients with kidney stone disease. This standard
assay is used to assess the minimum concentration of Ox required
to induce COM nucleation in the presence of inhibitors. Both HCA
and CA increased the upper limit of metastability (Fig. 4a) compared
to the untreated urine control (P < 0.02 for both inhibitors), thus
confirming their comparable inhibitory effect on crystal nucleation
in the urine milieu. As some fruits, such as Garcinia gummi-gutta
(common name garcinia cambogia or Malabar tamarind), contain
HCA, we purchased clinical-grade garcinia cambogia extract for
human trials to evaluate HCA bioavailability. For HCA to function
as a kidney stone treatment, it needs to be excreted in urine. HCA is
not a normal component of urine, which we confirmed by measuring
HCA concentration in random urine samples in five healthy subjects
(three men and two women, mean age 33.4 years) not taking HCA
supplement. The result of this control revealed that HCA was below
the level of detection (<0.05 mM) in all five subjects. Prior studies have
documented detectable blood levels of HCA after ingestion, although
only a fraction of ingested drug is absorbed. The drug does not
appear to be metabolized by humans, but urine excretion rates of
HCA have not been reported previously. To this end, we measured
urinary excretion of orally ingested HCA in seven non-kidney-stone-forming
subjects (five men and two women with mean age 44.6 years).
The subjects took the recommended dose of the supplement,
and on the third day of ingestion urine was collected for 24 h and
the excreted HCA was measured by ion chromatography (Fig. 4b).
As shown in Fig. 4c, the average HCA excretion is 1.1 ± 0.6 mmol per
day (with a mean concentration of 0.7 ± 0.6 mM HCA).

Figure 4 | In vitro assay and urinary excretion of HCA. a, The upper
limit of metastability in human urine expressed as the calcium oxalate
concentration product in the presence of inhibitor (C_CA = C_HCA = 2 mM)
exceeds that of the control (n = 8 subjects). Solid lines refer to the average
and the dashed boxes extend 1 s.d. above/below the average. b, Ion
chromatography of (i) a standard containing 0.2 mM CA
and 0.1 mM HCA, (ii) human urine before treating with isocitrate
dehydrogenase, and (iii) human urine from a subject taking HCA
supplement after treating with isocitrate dehydrogenase. Data are shifted
in the y axis for visual clarity. The inset shows an image of the fruit
garcinia cambogia, which contains HCA. c, Urinary excretion of HCA
from oral administration of clinical-grade garcinia cambogia extract in
seven human subjects, S1–S7. Each sample was run in duplicate (data are
the averages with coefficients of variation c_v < 1.5%). All subjects tolerated
the short-term dosing protocol with no reported side effects.

To summarize, AFM measurements and DFT calculations reveal 
a new thermodynamic mechanism of crystal growth inhibition 
that deviates from classical kinetic models of inhibitor-crystal 
interactions. We show that two organic anions with high affinity 
for binding to COM surfaces induce localized strain on the crys- 
tal lattice. When adsorbed at moderate coverage, these inhibitors 
cause COM crystal surfaces to dissolve in supersaturated solution. 
Our proposed criterion based on the relative binding energies of 
inhibitor and solute with crystal surfaces may prove to be relevant 
for predicting the dissolution of other crystalline materials. The 
comparison of HCA and CA in this study also highlights the subtle 
nuances of rational design wherein a small difference in molecular 
structure (for example, insertion of one alcohol group) can substan- 
tially alter inhibitor specificity and efficacy. Moreover, we report 
that HCA shows promise as a potential therapy to prevent kidney 
stones. HCA may be preferred as a therapy over potassium citrate, such 
as in patients with alkaline urine where a further increase in urine pH 
could cause calcium phosphate stones; however, a better understanding
of HCA metabolism in humans, optimal dosing regimens, and long-term
safety and tolerability is needed before HCA could be studied
in a prospective clinical trial of kidney stone prevention.


METHODS

Materials. The following reagents were purchased from Sigma Aldrich (St Louis,
MO): calcium chloride dihydrate (ACS Reagent, 99+%), sodium oxalate (Na₂C₂O₄,
>99%), sodium hydroxide (98%), hydrochloric acid (37%), sodium citrate tribasic
dihydrate (ACS reagent, >99.0%), and potassium hydroxycitrate tribasic
monohydrate (95%). Sodium chloride (99.9% ultrapure bioreagent) was purchased from
JT Baker. Deionized water used in all experiments was purified with an Aqua
Solutions RODI-C-12A purification system (18.2 MΩ). All reagents were used as
received without further purification. 

Garcinia cambogia was purchased from Swanson Health Products (Super 
CitriMax Clinical Strength Garcinia Cambogia Extract, item SWD051). The 
recommended serving size (2 capsules) contained the following components and 
corresponding mass per serving: calcium (120 mg), chromium (130 μg), potassium
(180 mg), and garcinia cambogia extract (1.5 g) containing HCA (900 mg). 

COM bulk crystallization. Batch crystallization was carried out in a 20-ml glass
vial by dissolving NaCl in deionized water, then adding 0.7 ml of 10 mM CaCl₂
stock solution. A clean glass slide (about 1.3 × 1.3 cm²) was placed at the bottom of
the vial to collect the crystals for microscopy. The sample vial was then placed in an
oven set to 60°C for 1 h to ensure the solution reached the set point temperature for
crystallization. Subsequently, 0.7 ml of 10 mM Na₂C₂O₄ stock solution was added
to the vial dropwise while continuously stirring at a rate of about 400 r.p.m. To
investigate the effect of growth inhibitors (that is, CA or HCA) on COM crystallization,
an appropriate quantity of the inhibitor was added to the growth solution
before Na₂C₂O₄ addition. The final growth solution had a composition of 0.7 mM
CaCl₂:0.7 mM Na₂C₂O₄:150 mM NaCl:x μg ml⁻¹ inhibitor (where x = 0–100) and
a total volume of about 10 ml. Crystallization was performed at 60°C for 3 days at
static conditions (that is, without stirring or agitation). The glass slide (substrate) 
was removed from the solution, gently washed with deionized water, and dried at 
room temperature before analysis. The pH of COM growth solution was measured 
before and after crystallization using an Orion 3-Star Plus pH benchtop meter with 
a ROSS Ultra electrode (8102BNUWP). 
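
The final concentrations quoted for the growth solution follow from the stock volumes by simple dilution (C₁V₁ = C₂V₂). The sketch below checks this and estimates the mass of NaCl needed for 150 mM in 10 ml; the function name is illustrative, the total volume is taken as exactly 10 ml although the text says "about 10 ml", and the NaCl molar mass of 58.44 g mol⁻¹ is a standard value rather than one given in the text.

```python
def diluted_conc_mM(stock_mM, aliquot_ml, total_ml):
    """Concentration after diluting an aliquot of stock to the total volume."""
    return stock_mM * aliquot_ml / total_ml

TOTAL_VOLUME_ML = 10.0          # "about 10 ml" in the text, assumed exact here
NACL_MOLAR_MASS_G_PER_MOL = 58.44

# 0.7 ml of 10 mM stock in 10 ml total, for both CaCl2 and Na2C2O4.
calcium_mM = diluted_conc_mM(10.0, 0.7, TOTAL_VOLUME_ML)
oxalate_mM = diluted_conc_mM(10.0, 0.7, TOTAL_VOLUME_ML)

# Mass of NaCl for 150 mM in the total volume: mol/L * L * g/mol, in mg.
nacl_mg = 0.150 * (TOTAL_VOLUME_ML / 1000.0) * NACL_MOLAR_MASS_G_PER_MOL * 1000.0
```

Both solutes come out at 0.7 mM, matching the stated composition, and roughly 88 mg of NaCl would be needed under these assumptions.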

Characterization of COM crystallization. The size and morphology of COM 
crystals prepared in the absence and presence of inhibitors was assessed by optical 
microscopy using a Leica DM2500-M instrument. Brightfield images were 
obtained in reflectance mode to quantify the crystal dimensions, which we report 
as length L in the [001] direction, width W in the [010] direction, and the length- 
to-width (L/W) aspect ratio. A minimum of 150 crystals from three separate 
batches were measured to obtain an average aspect ratio for data reported in Fig. 1f.
COM crystal number density, ρ_COM, was measured as the number of crystals per
area of glass slide. The ρ_COM values reported in Fig. 1i are an average of at least ten
areas on glass slides from three separate batches (each micrograph area is 647 μm
× 484 μm). The COM [100] thickness was measured from a combination of
optical and electron micrographs. Scanning electron microscopy (SEM) was 
performed using a FEI 235 Dual-Beam Focused Ion-beam instrument equipped 
with a SEM sample extraction probe. SEM samples were prepared by gently press- 
ing COM crystals on the glass slide to carbon tape to transfer crystals to the 
sample disk. Each sample was coated with a layer of carbon (about 20 nm thick) 
to reduce the effects of electron beam charging. The average values reported in 
Fig. 1g were obtained from measurements of more than 150 crystals from three 
separate batches. 

The effect of growth inhibitors on the kinetics of COM crystallization was 
measured using a calcium ISE from ThermoScientific equipped with an Orion 
9720BNWP ionplus electrode. ISE measurements track the temporal reduction 
in free calcium ion concentration in the growth solution during crystallization 
(including the effects of both nucleation and crystal growth)*4. Growth solutions 
were prepared similar to COM bulk crystallization, but at room temperature 
using a solution with supersaturation ratio S = 3.8 and a composition of 0.5 mM
CaCl₂:0.5 mM Na₂C₂O₄:150 mM NaCl:x μg ml⁻¹ inhibitor (where x = 0–100). For
ISE studies, we used the calcium oxalate solubility product reported by ref. 35 in 
order to calculate the calcium oxalate supersaturation. ISE measurements were 
performed at room temperature under constant stirring (about 1,200 r.p.m.) to 
minimize the induction time”*"*. Plots of consumed calcium ion concentration as 
a function of time were generated for each inhibitor concentration. A minimum of 
eight measurements were performed for each data point reported in Fig. 1h. The 
data were normalized by subtracting the concentration of the initial time point 
(see Extended Data Fig. 1). The approximate linear slope of these curves during 
the first 40 min of crystallization corresponds to the rate of Ca²⁺ depletion. The
efficacy of growth inhibitors was determined by the percentage inhibition, which 
was calculated by comparing the change in slope of the growth curve in the pres- 
ence of inhibitor relative to that in the absence of inhibitor (that is, the control). 
Prior to ISE measurements, the electrode was calibrated using a standard calcium 
solution (0.1 M, Orion Ion Plus), which was diluted with deionized water to three 
concentrations: 0.1 mM, 1.0 mM and 10.0 mM. The ionic strength of each solution
was adjusted using a standard solution (ISA, Thermo Scientific), which was added
in a 1:50 volume ratio of ISA-to-standard.
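
The percentage-inhibition calculation described above can be illustrated with a short Python sketch. It is not the analysis code used in the study: the function names, the plain least-squares fit and the example readings in the test data are assumptions. Only the 40 min fitting window, the normalization to the initial time point and the comparison of slopes against the control come from the text.

```python
def least_squares_slope(x, y):
    """Slope of the ordinary least-squares line through the points (x, y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def depletion_rate(times_min, free_ca, window_min=40.0):
    """Rate of Ca2+ depletion from the first `window_min` minutes of an ISE trace."""
    # Normalize to the initial time point so that consumed calcium starts at zero.
    consumed = [free_ca[0] - c for c in free_ca]
    kept = [(t, c) for t, c in zip(times_min, consumed) if t <= window_min]
    if len(kept) < 2:
        raise ValueError("need at least two points inside the fitting window")
    ts, cs = zip(*kept)
    return least_squares_slope(ts, cs)


def percent_inhibition(rate_with_inhibitor, rate_control):
    """Relative reduction in depletion rate compared with the control, in percent."""
    return 100.0 * (1.0 - rate_with_inhibitor / rate_control)
```

Readings taken after the fitting window are ignored, so a curve that flattens late in the experiment does not bias the initial slope.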

In situ AFM was performed using a Digital Instruments Multimode IV (Santa
Barbara, CA) to examine topographical images of COM crystals and capture the
dynamics of surface growth in real time. COM crystals (about 50 μm in length)
were mounted on an AFM specimen disk (Ted Pella) covered with a thin layer 
of thermally curable epoxy (MasterBond EP21AOLV), in accordance with pre- 
viously reported protocols****”. The epoxy was partially cured in an oven for 
about 20 min at 60°C. COM crystals on glass slides from bulk crystallization assays 
(in the absence of inhibitor) were immobilized on the partially cured epoxy by 
gently pressing the glass slide to transfer crystals with either their (100) or (010) 
faces oriented normal to the specimen surface. The sample was then placed in an 
oven at 60°C for an additional 3 h to completely cure the epoxy. All AFM measurements
were performed using silicon nitride probes with gold reflex coating and
a spring constant of 0.15 N m⁻¹ (Olympus, TR800PSA). In situ experiments were
performed to monitor surface growth in supersaturated calcium oxalate solution. 
We measured the velocity of step advancement and changes in hillock morphology 
on COM (010) and (100) surfaces in the absence or presence of inhibitors. The 
reported step velocity (Fig. 2e) is the average of at least 5 measurements of different 
steps. For in situ AFM measurements, a growth solution with supersaturation ratio
S = 4.1 was prepared, similar to the solution used for ISE measurements, but with
a composition of 0.18 mM CaCl₂:0.18 mM Na₂C₂O₄:x μg ml⁻¹ inhibitor (where
x = 0-2.1). The inhibitor concentration used in AFM measurements is lower than
that employed in bulk crystallization””*” owing to fewer crystals (that is, smaller 
COM surface area) on the AFM specimen disk. The AFM instrument was equipped 
with a fluid cell (model MTFML) containing two ports for inlet and outlet flow 
to maintain constant supersaturation during continuous imaging. The growth 
solution was delivered to the liquid cell using a dual syringe pump (CHEMYX, 
Fusion 200) with an in-line mixing configuration*® to combine CaCl₂ and Na₂C₂O₄
solutions with a combined flow rate of 0.2 ml min⁻¹. Inhibitors were introduced
into the Na₂C₂O₄ solution at the appropriate concentration (taking into account
the effect of dilution at the in-line flow connection). Continuous imaging was 
performed in contact mode with a scan rate of 7.0-9.2 Hz at 256 lines per scan. 
In vitro assays of COM crystallization in human urine. The upper limit of 
metastability was measured using a modified version of the method described by 
ref. 30. Urine aliquots were obtained over a 24-h urine collection period from eight 
patients with kidney stones. All urine samples were brought to a pH of 5.7 by the 
addition of potassium hydroxide or HCl as needed. Each urine sample was studied 
with no additive or with either CA or HCA added to increase their concentration 
by 2 mM. For each sample, 200 μl of urine was added to 12 wells of a 96-well microtitre
plate. Solutions of increasing concentration of Ox were pipetted into the urine
aliquots in the wells. The plate was placed on a shaker for 3 h at 37°C and then
the turbidity of the solution in each well was measured at 620 nm wavelength using
a VMax kinetic ELISA microplate reader (Molecular Devices, Sunnyvale, CA). 
The well at which turbidity increased determined the point of crystallization and 
the Ox concentration at this point is the amount of Ox in the urine measured at 
baseline plus the amount of Ox added to the urine in the well showing increased 
turbidity. The results are presented as the calcium oxalate concentration product 
(used as a surrogate of supersaturation) at the point of crystallization (see Fig. 4a). 
In addition to their crystal inhibition activity, both CA and HCA complex 
calcium in solution, lowering the concentration of ionized calcium. Both mecha- 
nisms should contribute to changes in the upper limit of metastability relative to 
the control, indicative of an inhibition of nucleation. Each urine sample was run 
in duplicate and the results were averaged. Statistical comparison was performed 
using the non-parametric Wilcoxon test. 
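
The readout of the upper limit of metastability described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the procedure's actual implementation: the function name, the use of the first well as the turbidity reference and the `rise_threshold` criterion are hypothetical, because the text does not state how an increase in turbidity was scored. The oxalate at the crystallization point is the baseline value plus the amount added to that well, and the concentration product is calcium multiplied by that oxalate value.

```python
def upper_limit_of_metastability(calcium_mM, baseline_ox_mM, added_ox_mM,
                                 turbidity, rise_threshold=0.05):
    """Find the first well in which turbidity rises and report the oxalate
    concentration and the calcium oxalate concentration product there.

    `rise_threshold` (absorbance units above the first well) is an assumed
    scoring rule, not one given in the text.
    """
    reference = turbidity[0]
    for added, reading in zip(added_ox_mM, turbidity):
        if reading - reference > rise_threshold:
            # Oxalate at the crystallization point: baseline plus amount added.
            total_ox = baseline_ox_mM + added
            return total_ox, calcium_mM * total_ox
    return None  # no well showed increased turbidity
```

Duplicate runs of the same urine sample would each be passed through this function and the two concentration products averaged.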
Human trials of HCA bioavailability. There were no prior reports of measure- 
ments of HCA in human urine in people not consuming HCA supplement. We 
tested the hypothesis that HCA is not a normal constituent of human urine by 
measuring the concentration of HCA in random urine samples from five healthy 
subjects (protocol number 1061857 Western Institutional Review Board). 

The hypothesis of the human trial study was that HCA, when orally admin- 
istered, will be excreted in urine. To test this hypothesis, we assessed urinary 
excretion to confirm the bioavailability of HCA through oral administration. The 
protocol was approved by the University of Houston Internal Review Board (case 
15176-01). Recruitment was limited to subjects between 21 and 65 years of age. 
Pregnant women and subjects with known severe chronic kidney disease (stage 
4 or 5) were excluded. All samples were collected and analysed with informed 
consent. 

The supplement used in the human trial was Super CitriMax Clinical Strength 
Garcinia Cambogia Extract. The active ingredient, HCA, is an inhibitor of ATP 
citrate lyase and is presumed to reduce lipogenesis as its mechanism of action for 
inducing weight loss*!. Each serving (2 capsules) contained 1.5 g garcinia cambogia 
extract with 900 mg active ingredient. The subjects were asked to take garcinia
cambogia extract for three days at the dose recommended by the manufacturer
(that is, two capsules three times a day). On the third day of garcinia cambogia
treatment, the subjects collected urine for 24 h. The urine was collected unrefrigerated
using an antimicrobial preservative.

HCA concentration was measured by ion chromatography using an ICS-2000
system (Dionex Corp., Sunnyvale, CA) with AS11 guard and analytic columns,
potassium hydroxide eluent, and a conductivity detection system. Because 
isocitrate co-elutes with HCA on this system, urine samples were pre-treated with 
isocitrate dehydrogenase to remove isocitrate interference. Hydroxycitric acid 
calcium salt, (—)-(P), purchased from ChromaDex Inc. (Irvine, CA) was used 
as a standard. 
DFT calculations. Molecular orbital DFT calculations were performed using 
the Turbomole/6.6 program package*’. We used the BP86 functional**! 
and accounted for dispersion energy corrections through the D3 method”? 
appropriate for capturing hydrogen bonds originating by the presence of HCA 
and CA inhibitors. The BP86 method has been shown to successfully capture the 
aggregation behaviour of metal-cation complexed organic acids**. The resolution 
of identity“! approximation along with multipole accelerated resolution of indices 
(MARI-J)* were used to accelerate the calculations. We used the def2-SV(P) 
basis set*® and accounted for solvent effects through the COSMO continuum 
solvation model (the solvent was water; dielectric constant ε = 78.46)*”. The COM
nanoparticle with a (100) surface termination consisted of 168 atoms (Ca₂₄C₄₈O₉₆),
whereas the one with a (021) termination consisted of 133 atoms (Ca₁₉C₃₈O₇₆).
The (001)-to-(100) step consisted of 224 atoms (Ca₃₂C₆₄O₁₂₈). We used the
whewellite (monoclinic) crystal unit cell** to build the COM nanoparticles (with
H₂O molecules being removed and simulated by implicit solvent effects). The (100)
and (021) surfaces and (001)-to-(100) step were kept frozen and the inhibitors were 
allowed to fully relax in our calculations. We also performed calculations where the 
surface of COM was allowed to relax and the overall adsorption trends remained 
the same (see partial relaxation calculations in Extended Data Fig. 7). The isomer 
of HCA in garcinia cambogia was taken into account, consistent with the natural 
extract used in human trials“’. The binding energy of inhibitors and oxalate to 
crystal surfaces is defined as: 


$$\mathrm{BE} = E_{\mathrm{inhibitor+COM}} - E_{\mathrm{inhibitor}} - E_{\mathrm{COM}} \qquad (1)$$


where E_x represents the total electronic energy of species x. For calculations of the
‘inhibitor + COM’ species, the deprotonated forms of the acids were placed on the 
COM surfaces and their hydrogens (from the deprotonation) were placed at a posi- 
tion far away from the interaction centre and anti-diametric on the COM nanopar- 
ticle to bring the system to a neutral charge state and avoid calculations under high 
charge states (that is, charge counterbalance). The energy of the inhibitor in the gas 
phase, labelled as E_inhibitor, corresponds to molecules in their protonated (neutral)
state. Analogously to the previous expression, the binding energy of the complexes
used in determining the affinity of HCA, CA and Ox for Ca²⁺ is defined as:


$$\mathrm{BE}_{\mathrm{complex}} = E_{\mathrm{complex}} + \frac{nd}{2}E_{\mathrm{H_2}} - E_{\mathrm{Ca}} - nE_{\mathrm{inhibitor}} \qquad (2)$$


where n represents the number of organic molecules in the complex, d is the deprotonation
state of the inhibitor (that is, d = 1 corresponds to single, d = 2 to double,
and d = 3 to triple deprotonated states, respectively), and E_x represents the electronic
energy of species x. The inhibitor is in the protonated form in the gas phase
and we use H₂ as a reference state for the hydrogen of the acids. Approximately four
different initial conformations were taken into consideration for each inhibitor- 
surface and complexation calculation, wherein we report the lowest-energy 
conformations. Our obtained minima from the optimization calculations were 
further validated by the absence of any imaginary frequencies on the complexes 
and on the inhibitors interacting with COM crystal surfaces. 

To quantify crystal lattice strain, we employ a geometric comparison between 
the frozen and relaxed structures of COM surfaces in the presence and absence 
of adsorbates (growth inhibitors). We developed a distance metric to quantify the 
displacement of relaxed atoms (relative to their frozen state) on COM crystal sur- 
faces by taking the average absolute value of the (x, y, z) coordinate displacements. 
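The average displacement δ is represented as:

$$\delta = \frac{1}{N_{\mathrm{atoms}}} \sum_{i=1}^{N_{\mathrm{atoms}}} \sqrt{(x_{i,\mathrm{frozen}} - x_{i,\mathrm{relaxed}})^2 + (y_{i,\mathrm{frozen}} - y_{i,\mathrm{relaxed}})^2 + (z_{i,\mathrm{frozen}} - z_{i,\mathrm{relaxed}})^2} \qquad (3)$$

where N_atoms represents the number of atoms that are relaxed on the surface of the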
COM crystallographic plane. 
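
The displacement metric can be illustrated with a minimal Python sketch. It assumes that the per-atom displacement is the Euclidean distance between the frozen and relaxed positions and that both coordinate lists are given in the same atom order and units (for example Å). The function name and the example coordinates in the test data are hypothetical.

```python
import math


def average_displacement(frozen, relaxed):
    """Mean Euclidean displacement of relaxed atoms relative to their frozen positions.

    `frozen` and `relaxed` are equal-length sequences of (x, y, z) tuples,
    listed in the same atom order.
    """
    if len(frozen) != len(relaxed):
        raise ValueError("frozen and relaxed structures must list the same atoms")
    if not frozen:
        raise ValueError("at least one relaxed atom is required")
    total = sum(math.dist(f, r) for f, r in zip(frozen, relaxed))
    return total / len(frozen)
```

Only the atoms that were allowed to relax should be passed in, so that frozen atoms with zero displacement do not dilute the average.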

CA and HCA speciation. The pH of crystallization media impacts the net 
charge of the inhibitor. Both CA and HCA have three dissociation constants corresponding
to each of their carboxylic acid groups. In physiologically relevant
environments, such as the kidney, the pH of urine varies between 5 and 8 (ref. 50).
In vitro COM crystallization assays are performed at approximately neutral pH. For 
instance, the growth solutions employed in this study have pH = 6.2 ± 0.2, which
is identical to previous in vitro assays published by Rimer and co-workers??3437"1, 
As shown in Extended Data Fig. 2a, equilibrium calculations at pH 6.2 predict 
HCA to be predominantly in the fully dissociated state (that is, net charge = —3) 
while approximately 40% of CA species are in the fully dissociated state, labelled
as CA³⁻ or C₆H₅O₇³⁻.

The acid-base speciation reactions for CA along with the respective dissociation 
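constants, pK_ai, are the following:

$$\mathrm{C_6H_8O_7} \overset{\mathrm{p}K_{a1}}{\rightleftharpoons} \mathrm{C_6H_7O_7^{-}} + \mathrm{H^+} \qquad (4)$$
$$\mathrm{C_6H_7O_7^{-}} \overset{\mathrm{p}K_{a2}}{\rightleftharpoons} \mathrm{C_6H_6O_7^{2-}} + \mathrm{H^+} \qquad (5)$$
$$\mathrm{C_6H_6O_7^{2-}} \overset{\mathrm{p}K_{a3}}{\rightleftharpoons} \mathrm{C_6H_5O_7^{3-}} + \mathrm{H^+} \qquad (6)$$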


The reported pK_a values were obtained from Martell and Smith’. There are
discrepancies in the reported pK_a3 value owing to the ionic strength dependency
of the dissociation constant. For instance, some references report pK_a3 = 5.66 when
the ionic strength is 100 mM (ref. 52), which is less than the ionic strength of
calcium oxalate solutions used in ISE and bulk crystallization assays. On this basis,
it is likely that the dominant species in calcium oxalate growth solutions is CA³⁻.
DFT calculations of CA-crystal interactions and CA-calcium ion complexes in
the manuscript were performed using CA³⁻; however, equivalent DFT calculations
were performed with CA²⁻. The results of these calculations, which are presented
in Extended Data Figs 6 and 8, reveal that the conclusions reached in the manu- 
script are not altered by CA charge. Moreover, we conducted bulk crystallization, 
ISE, and in situ AFM measurements in growth solutions at pH 6.2 (nominal 
condition) and pH 8.0 to assess the influence of CA charge on its efficacy as a COM 
crystal inhibitor. At pH 8.0, approximately 99.8% of CA species are in the fully 
dissociated state (neglecting the effect of ionic strength). Bulk crystallization assays 
reveal essentially no change in crystal morphology and size at these two pH
values. Similarly, in situ AFM measurements at pH 6.2 and 8.0 at C_CA = 0.1 μg ml⁻¹
showed no apparent change in etch pit formation on COM (100) surfaces. ISE
experiments at pH 6.2 and 8.0 reveal subtle differences in the percent inhibition
of COM growth at C_CA = C_HCA = 20 μg ml⁻¹ (Extended Data Fig. 2b). The percent
inhibition of COM growth is slightly reduced at higher pH, but HCA is the most 
effective inhibitor irrespective of solution alkalinity. 

The speciation reactions and corresponding equilibrium dissociation constants 
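for HCA are the following”:

$$\mathrm{C_6H_8O_8} \overset{\mathrm{p}K_{a1}}{\rightleftharpoons} \mathrm{C_6H_7O_8^{-}} + \mathrm{H^+} \qquad (7)$$
$$\mathrm{C_6H_7O_8^{-}} \overset{\mathrm{p}K_{a2}}{\rightleftharpoons} \mathrm{C_6H_6O_8^{2-}} + \mathrm{H^+} \qquad (8)$$
$$\mathrm{C_6H_6O_8^{2-}} \overset{\mathrm{p}K_{a3}}{\rightleftharpoons} \mathrm{C_6H_5O_8^{3-}} + \mathrm{H^+} \qquad (9)$$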


As shown in Extended Data Fig. 2a, the percentage of fully dissociated HCA
(labelled as HCA³⁻ or C₆H₅O₈³⁻) at pH 6.2 is 92%. At pH 8.0 (that is, the upper
limit of urine pH), the percentage of HCA³⁻ is nearly 100%. As such, DFT calculations
in the manuscript were performed using the dominant species, HCA³⁻.
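
The equilibrium speciation of a triprotic acid at a given pH can be sketched in Python as below. This is an illustrative calculation, not the one used for Extended Data Fig. 2a: the function name is hypothetical, activity corrections for ionic strength are neglected, and the pK_a values in the usage note (3.13, 4.76 and 6.40, commonly quoted for citric acid at 25°C and low ionic strength) are assumptions that do not come from this text.

```python
def species_fractions(pH, pKas):
    """Equilibrium fractions of a polyprotic acid at a given pH.

    Returns a list whose j-th entry is the fraction of the species that has
    lost j protons (entry 0 is the fully protonated acid, the last entry is
    the fully dissociated anion). Activity corrections are neglected.
    """
    h = 10.0 ** (-pH)
    # Each successive species scales the previous one by Ka_j / [H+].
    terms = [1.0]
    for pKa in pKas:
        terms.append(terms[-1] * 10.0 ** (-pKa) / h)
    total = sum(terms)
    return [t / total for t in terms]
```

With the assumed citric acid constants, `species_fractions(6.2, (3.13, 4.76, 6.40))[3]` evaluates to about 0.38, in line with the roughly 40% of CA³⁻ quoted above for pH 6.2. Substituting a lower pK_a3, such as the ionic-strength-dependent value of 5.66, shifts the balance further towards the fully dissociated anion.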

COM (100) surface dissolution in the presence of CA and HCA. In Extended 
Data Fig. 4 we provide a detailed analysis of etch pit formation on a COM (100) 
surface in the presence of C_HCA = 0.25 μg ml⁻¹. Time-resolved in situ AFM images
at periodic times reveal the formation and growth of etch pits (Extended Data
Fig. 4a-c). For each dashed line in the AFM image, the corresponding height profiles
are provided in Extended Data Fig. 4d-f, respectively. The nominal growth of hill- 
ocks on the (100) surface in the absence of inhibitor (Extended Data Fig. 4d) occurs 
by the advancement of single steps. Each step has an average height of 0.4 nm,
which is the approximate size of the unit cell parameter in the a direction. Upon 
the addition of HCA, etch pits monotonically evolve in both depth (Extended Data 
Fig. 4g) and width (Extended Data Fig. 4h) with imaging time. The concentration 
of inhibitor in this study is approximately three orders of magnitude less than the 
concentration of Ca²⁺ ions in supersaturated solution, indicating that etch
pit formation cannot be attributed solely to reduced calcium oxalate supersaturation as
a result of inhibitor-calcium ion complexation.

Mechanism of HCA-induced dissolution of COM crystals. Common mecha- 
nisms of crystal growth inhibition include step pinning and kink blocking**™. 
The former involves the adsorption of inhibitors on terraces or step edges,
which impedes the advancement of steps. Inhibitors adsorbed with an average
adsorbate-to-adsorbate spacing comparable to the critical radius of curvature r_c
for the step (that is, high surface coverage at high inhibitor concentration) impede
step advancement, leading to reduced step velocity and ultimately suppressed
growth when the step radius approaches r_c (refs 54 and 55). The mode of action
for HCA on COM crystallization at low inhibitor concentration, as inferred from 
time-resolved AFM images of the COM (010) surface, does not appear to be step 
pinning owing to the absence of protrusions on steps (which is a characteristic trait 
of inhibitors that operate by a step-pinning mode of action™). AFM measurements 
of COM (100) surface growth at high concentration of either CA or HCA reveal 
the formation of irregular-shaped steps, which may indicate that inhibitors bind 
to step edges on this surface (consistent with a previously proposed mechanism 
for CA inhibition”!). 

Kink blocking is commonly observed when the crystal grows by a screw dis- 
location mechanism***” wherein inhibitors adsorb to kink sites and reduce the kink 
density and extend the critical length of the step'*. The net result is a decreased rate 
of step advancement within the plane of measurement, as well as reduced growth 
of hillocks in the direction normal to this plane. AFM studies of COM growth 
have shown that steps emanating from screw dislocations on the (100) and (010) 
surfaces advance across the crystal plane. Burton, Cabrera and Frank” derived a 
theoretical model of spiral growth predicting the rate of crystal growth normal to 
the basal face, G_hkl, as follows:


$$G_{hkl} = h_{hkl}\frac{\langle v_i \rangle}{\langle y_i \rangle} = \frac{h_{hkl}}{\tau}$$


where h_hkl is the height of (hkl) steps advancing along the COM (100) plane, y_i is
the interstep distance, and v_i is the velocity of step advancement**. The subscript
refers to the ith edge of growth hillocks, which advance across the surface in spiral
patterns with a characteristic rotation time τ. Observations of decreased COM
[100] thickness in Fig. 1g are qualitatively consistent with this theoretical equation. 

It is energetically more favourable for inhibitors to adsorb on kink sites rather 
than on step edges. Both modes of action can alter the shape of crystals through 
specific inhibitor—step or inhibitor-kink interactions. In our studies of COM 
crystallization, the mode of action for CA and HCA at low inhibitor concentra- 
tion inferred from in situ AFM images does not appear to be kink blocking. To 
quantitatively analyse AFM data, we constructed Bliznakov plots, which repre- 
sent the relationship between relative step velocity and inhibitor concentration. In 
Extended Data Fig. 5a we plot a hypothetical v/v₀ trend with increasing inhibitor
concentration as a function of calcium oxalate supersaturation, where v and v₀
are step velocities in the presence and absence of inhibitor, respectively. For the
step-pinning mode of action, this plot should exhibit a monotonic decrease in v/v₀
with increasing inhibitor concentration (see ref. 8 for example). Plots for HCA
(Extended Data Fig. 5b) deviate from this trend, suggesting that HCA does not
bind to the (010) surface. For inhibitors that operate in a kink-blocking mode of
action, a plot of v₀/(v₀ − v) with increasing inverse inhibitor concentration should
produce a linear trend; however, plots for HCA exhibit no apparent trend, which 
leads us to believe that HCA preferentially binds to steps, as illustrated in Fig. 3c. 
Note that the plateau region for v (Fig. 2e) observed at low inhibitor concentration 
(C_HCA < 0.04 μg ml⁻¹) has also been reported by others examining the effects of
proteins and polyamino acids on COM (010) and (100) step advancement*®. In 
these studies, it was suggested that the size of the step relative to the size of the 
inhibitor can lead to complex sorbate-crystal binding (including the possibility of 
cooperative effects among multiple HCA molecules at step sites). 
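
The two diagnostic transformations discussed above (relative step velocity against inhibitor concentration for step pinning, and v₀/(v₀ − v) against inverse concentration for kink blocking) can be sketched in Python as follows. The function names are hypothetical, and the straight-line fit with its R² value is only one assumed way of judging whether a kink-blocking plot is linear; it is not stated to be the authors' criterion.

```python
def relative_velocities(v, v0):
    """Relative step velocities v/v0 for a series of inhibitor concentrations."""
    return [vi / v0 for vi in v]


def is_monotonic_decrease(values):
    """True if each value is strictly smaller than the one before it."""
    return all(b < a for a, b in zip(values, values[1:]))


def kink_blocking_coordinates(conc, v, v0):
    """(1/C, v0/(v0 - v)) pairs; points with C = 0 or v = v0 are undefined and skipped."""
    return [(1.0 / c, v0 / (v0 - vi)) for c, vi in zip(conc, v) if c > 0 and vi != v0]


def linear_fit(points):
    """Least-squares slope, intercept and R^2 for a list of (x, y) points."""
    n = len(points)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot
```

A data set for which `is_monotonic_decrease` fails and `linear_fit` returns a low R² would match neither idealized mode of action, which is the situation described for HCA.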

Inhibitor complexation of free calcium ions in solution reduces the rate of COM 
growth by lowering the supersaturation. The solubility of COM crystals C_s (that is,
equilibrium concentration of solute) is as follows:

$$C_s = \sqrt{\frac{K_{sp}}{\gamma_{\mathrm{Ca}}\gamma_{\mathrm{Ox}}}}$$

where K_sp is the solubility product (1.66 × 10⁻⁹ mol² l⁻² at 25°C)*° and γ_i is the
activity coefficient (i = Ca or Ox). For the equimolar growth solutions used in
this study (C_Ca = C_Ox), the activity coefficients for Ca and Ox are calculated using
the expression:


(11) 


(12) 


where z is ion valence and I is ionic strength. The addition of either HCA or CA to
a calcium oxalate growth solution lowers supersaturation by (i) forming complexes
with free calcium ions to reduce C_Ca, and (ii) increasing the ionic strength, which
alters the activity coefficients, thereby increasing the solubility of COM crystals.
Inhibitor concentrations used in AFM experiments (about 3 μM) are insufficient
to reduce supersaturation by complexation (that is, Ca²⁺/HCA ≈ 10²), and impose
only minor changes in solubility due to activity effects (that is, C_s increases by
about 0.05%). If the inhibitor concentration in bulk crystallization is increased to
values that are commensurate with supersaturation, the effects of complexation
and activity are important. For example, we performed bulk crystallization in the
presence of about 0.3 mM HCA (or 100 μg ml⁻¹) and observed few COM crystals,
indicating almost complete suppression of COM crystallization. 

In the literature, alternative mechanisms have been proposed to describe the 
potential effects of inhibitors (or impurities) on crystal solubility. At low super- 
saturation, it is suggested that inhibitors can increase local solubility at crystal 
interfaces, potentially inducing dissolution when the mother liquor is close to 
saturation®!. Under similar conditions, it is also possible to observe the effects 
of Ostwald ripening, where larger particles grow at the expense of small crystals
that dissolve, owing to the Gibbs-Thomson effect®’. Others have proposed
that changes in surface free energy can be caused by the site occupancy in crystal 
lattices with embedded solid particles or impurities®’, which induce stress on the 
crystal (analogous to the simulations by ref. 26 described in Fig. 3a). Additionally, 
the formation of defects on crystal surfaces (such as vacancies or dislocations) 
can induce stress on the crystal lattice, leading to reduced rates of growth under 
supersaturated conditions™. 

Extended Data Figure 1 | Examples of ISE measurements. In situ ISE
measurements of COM crystallization in the presence of CA (a) and
HCA (b) at concentrations of 0 μg ml⁻¹, 20 μg ml⁻¹, 40 μg ml⁻¹,
60 μg ml⁻¹, 80 μg ml⁻¹ and 100 μg ml⁻¹. The y axis is the quantity of
free calcium ions in the growth solution that are consumed during
crystallization. Linear regression of each curve provides the rate of crystal
growth. The percentage inhibition of COM crystallization is obtained
by comparing the slopes of ISE curves in the presence of inhibitor (filled
symbols) to the slope in the absence of inhibitor (open diamonds). ppm,
parts per million.

Extended Data Figure 2 | Inhibitor speciation and its effect on COM
growth. Here we compare calculations of inhibitor speciation with ISE
measurements of COM percentage inhibition at pH 6.2 (solid bars) and
pH 8.0 (patterned bars). a, Percentage of deprotonated CA and HCA
species, calculated from equations (4)-(9). Fully dissociated species
(charge −3) are represented by white bars and partially dissociated species
(charge −2) are represented by grey bars. b, Results of ISE measurements
in supersaturated calcium oxalate solution (S = 3.8) in the presence of
C_CA = 20 μg ml⁻¹ (orange bars) and C_HCA = 20 μg ml⁻¹ (blue bars). The
percentage inhibition of COM crystal growth slightly decreases at higher
alkalinity. HCA is the more effective inhibitor irrespective of solution pH.
Data are the average of more than 10 measurements (error bars are 1 s.d.;
P < 0.05 comparing HCA to CA at both levels of pH).

Extended Data Figure 3 | Optical micrographs of COM crystals. Optical micrographs of COM crystals after heating a growth solution for 3 days at 60°C.
Here we compare the control sample in the absence of growth inhibitor (a) to solutions prepared with C_CA = 20 μg ml⁻¹ (b) and C_HCA = 20 μg ml⁻¹ (c).
Scale bars, 100 μm.

Extended Data Figure 4 | HCA-induced etch pits on the COM (100)
surface. a-c, Time-resolved images during in situ AFM measurements
of a COM (100) crystal surface in the presence of HCA. The surface is
first imaged in the absence of inhibitor (a) and then at C_HCA = 0.25 μg ml⁻¹
(b and c). The elapsed time between each deflection mode image is
approximately 4 min. d-f, Height (or depth) profiles corresponding to
the dashed lines in a-c, respectively. As shown in a and d, the COM (100)
surface before the addition of HCA is composed of single steps with height
approximately 0.4 nm (see inset in d), which approximately corresponds
to the unit cell dimension in the [100] direction (a = 0.6 nm)**. Depth
profiles in e and f show the temporal evolution of a single etch pit.
Quantitative analysis of etch pit dimensions with respect to depth d (g)
and width w (h) reveal monotonic changes during 10 min of continuous
AFM imaging. Schematics of an etch pit (left inset in g) with highlighted
depth (right inset in g) and height (inset in h) are shown to aid
visualization.

Extended Data Figure 5 | Construction of Bliznakov plots. a, Theoretical
Bliznakov plot for crystal inhibitors that follow a step-pinning mode of
action as a function of increasing calcium oxalate relative supersaturation
σ (derived from ref. 65). b, Plots generated from in situ AFM data on
… direction (purple squares) and [021] direction (blue diamonds) as a
function of increasing C_HCA. The deviation of experimental data from
theoretical trends suggests that step pinning is not the dominant
mechanism by which HCA inhibits COM surface growth.

Extended Data Figure 6 | Binding energy of inhibitors on the COM
(100) surface. The results of DFT calculations showing the adsorption
configuration and binding energy of CA³⁻ (a), CA²⁻ (b), HCA³⁻ (c)
and Ox²⁻ (d) on the COM (100) surface and of HCA³⁻ (e) and CA³⁻ (f)
binding to a (001) step on the COM (100) surface. For these calculations,
the surfaces are kept frozen (that is, unrelaxed). Atoms are coloured
as follows: hydrogen (white), carbon (grey), oxygen (red) and calcium
(green).

Extended Data Figure 7 | CA and HCA interactions with relaxed
COM surfaces. Superimposed structures of CA and HCA interacting with
unrelaxed (coloured balls and sticks) and partially relaxed (yellow sticks)
surfaces of COM crystals. Side-view snapshots depict CA interaction
with (100) (a) and (021) (b) surfaces and HCA interaction with (100) (c)
and (021) (d) surfaces. Atoms are coloured as follows: hydrogen (white),
carbon (grey), oxygen (red) and calcium (green). The (100) surface is
practically unaffected by the presence of HCA, whereas the (021) surface
shows dislocations due to strain induced by the high binding affinity of the
inhibitor (see also Extended Data Table 1). The total energy of the (100)
face changes by +18.8 kcal mol⁻¹ owing to HCA adsorption compared
to the +28.1 kcal mol⁻¹ energy change of the (021) face (positive signs
are endothermic and energy values correspond to the difference between
single point energy calculations of the COM surface with inhibitors
removed). The corresponding values for the total energy change of the
(100) and (021) faces from the presence of the CA are +14.2 kcal mol⁻¹
and +25.7 kcal mol⁻¹, respectively. The partial relaxation of COM surfaces
compared to unrelaxed surface calculations (Extended Data Fig. 6) does
not alter the overall trend in inhibitor-crystal binding affinity.

Extended Data Figure 8 | Complexation of organic acids with calcium. DFT-calculated binding energy (scaled per number of molecules, N) for the
complexation of organic anions HCA³⁻, CA²⁻ and Ox²⁻ with calcium ions. Note that the data for HCA³⁻ and Ox²⁻ are identical to Fig. 3g, and are
merely placed here for direct comparison with CA²⁻. Dashed lines connecting symbols are added to guide the eye.

Extended Data Table 1 | Quantification of lattice strain in the presence of HCA and CA

            Average displacement on     Average displacement on
            COM (100), δ (values in Å)  COM (021), δ (values in Å)

Pristine    0.17                        0.38
CA(-3)      0.25                        0.55
HCA(-3)     0.25                        0.71

Average displacement of (100) and (021) surface sites during partial surface relaxation where the pristine surfaces (absence of modifier) are compared to surfaces with
one adsorbed molecule of HCA or CA, calculated relative to the frozen surface.

An integrated design and fabrication strategy for
entirely soft, autonomous robots

Michael Wehner, Ryan L. Truby, Daniel J. Fitzgerald, Bobak Mosadegh, George M. Whitesides,
Jennifer A. Lewis & Robert J. Wood


Soft robots possess many attributes that are difficult, if not 
impossible, to achieve with conventional robots composed of rigid 
materials'. Yet, despite recent advances, soft robots must still be 
tethered to hard robotic control systems and power sources?"!°, 
New strategies for creating completely soft robots, including soft 
analogues of these crucial components, are needed to realize their 
full potential. Here we report the untethered operation of a robot 
composed solely of soft materials. The robot is controlled with 
microfluidic logic!’ that autonomously regulates fluid flow and, 
hence, catalytic decomposition of an on-board monopropellant fuel 
supply. Gas generated from the fuel decomposition inflates fluidic 
networks downstream of the reaction sites, resulting in actuation’”. 
The body and microfluidic logic of the robot are fabricated using 
moulding and soft lithography, respectively, and the pneumatic 
actuator networks, on-board fuel reservoirs and catalytic reaction 
chambers needed for movement are patterned within the body via a 
multi-material, embedded 3D printing technique'*"*. The fluidic and 
elastomeric architectures required for function span several orders 
of magnitude from the microscale to the macroscale. Our integrated 
design and rapid fabrication approach enables the programmable 
assembly of multiple materials within this architecture, laying the 
foundation for completely soft, autonomous robots. 

Soft robotics is a nascent field that aims to provide safer, more robust 
robots that interact with humans and adapt to natural environments 




Figure 1 | Fully soft, autonomous robot assembly. a, b, A microfluidic 
soft controller is pre-fabricated (a) and loaded into a mould (b). 

c-e, Matrix materials are poured into the mould (c) and fugitive and 
catalytic inks are EMB3D printed (d, e). Scale bar in e, 10 mm. f, After 
matrix curing, the printed fugitive inks auto-evacuate, yielding open 


better than do their rigid counterparts. Unlike conventional robots
composed of rigid materials, soft robots based on hydrogels,
electroactive polymers, granular media and elastomers exhibit elastic
moduli ranging from 10 kPa to 1 GPa (ref. 1), are physically resilient
and have the ability to passively adapt to their environment.
Moulded and laminated elastomers with embedded pneumatic networks
are widely used materials in soft robotics. Actuation of
these elastomeric composites occurs when interconnected channels
that make up the pneumatic network are inflated with incompressible
fluids or gases supplied via tethered pressure sources. Robotic end
effectors with bioinspired and rapid actuation, deployable crawlers
and swimmers with complex body motions, and robust jumpers
have been developed on the basis of this design strategy. However, in
each case, these robots are either tethered to or carry rigid systems for
power and control, yielding hybrid soft-rigid systems.

Creating a new class of fully soft, autonomous robots is a grand
challenge, because it requires soft analogues of the control and power
hardware currently used. Recently, monopropellant fuels have been
suggested as a promising fuel source for pneumatically actuated soft
robots. Their rapid decomposition into gas upon exposure to a catalyst
offers a strategy for powering soft robotic systems that obviates the
need for batteries or external power sources. Here, we report a method
for creating a completely soft, pneumatic robot (the ‘octobot’) with
eight arms that are powered by monopropellant decomposition.
channels. Scale bar, 2 mm. g, The octobot is removed from the mould and
inverted to reveal a fully soft, autonomous robot that is controlled via the
embedded microfluidic soft controller and powered by monopropellant
decomposition. Scale bar, 10 mm. Fluorescent dyes have been added in
e and g to assist in visualization of internal features.

Figure 2 | Multi-material, EMB3D printing. a, The octobot features
include (1) the body matrix, (2) the fuel reservoir matrix, (3) printed
fuel reservoir traces, (4) fugitive plugs in the soft controller, (5) printed
platinum reaction chambers, (6) printed pneumatic networks, (7) printed
vent orifices, (8) printed actuators and (9) moulded hyperelastic actuator
matrix. All printed features are composed of the fugitive ink except the
printed platinum reaction chambers (5), which are patterned using the
catalytic ink. b, The storage modulus, G′, of the fugitive ink, catalytic ink,
body matrix and fuel reservoir matrix as a function of shear stress. The
plateau moduli of the inks are an order of magnitude higher than those
of the matrix materials. c, Trace widths of the fugitive and catalytic inks
printed at 450 kPa and 345 kPa, respectively, decrease with print speed.
Error bars indicate the standard deviation for n = 3 measurements.
d, Optical images of channel cross-sections printed at speeds of 0.5 mm s⁻¹
and 10 mm s⁻¹, which demonstrate that trace dimensions can be changed
on-the-fly. Scale bars, 100 µm. e, f, Reaction chambers printed with the
catalytic inks contain a platinum-laden plug, as shown in a cross-section
(e; scale bar, 500 µm) and a scanning electron micrograph (f; scale bar, 25 µm).




To accomplish this, we use microfluidic logic (ref. 11) as a soft controller and
a multi-material, embedded 3D (EMB3D) printing method to fabricate 
pneumatic networks within a moulded, elastomeric robot body. Our 
hybrid assembly approach allows one to seamlessly integrate soft lithog- 
raphy, moulding and 3D printing to rapidly and programmably fabri- 
cate a range of materials and functional elements in the form factors 
that are required for autonomous, untethered operation of a soft robot.

To fabricate an octobot, we first micro-mould the soft controller
that houses the microfluidic logic necessary for controlling fuel decom- 
position (Fig. 1a). The soft controller, which is protected temporarily 
with a polyimide mask, is placed into a mould that is partially filled by 
hyperelastic layers, which are needed for actuation (Fig. 1b). Matrix 
materials are then poured into the mould (Fig. 1c) and the remain- 
ing soft robot features are EMB3D printed into the moulded matrix 
(Fig. 1d, e, Supplementary Video 1). After the matrix materials are 
cross-linked, the aqueous inks ‘auto-evacuate’ at elevated temperature 

Figure 3 | Octobot control logic. Discrete sides are shown in red and blue
for clarity. a, A system of check valves and switch valves within the soft
controller regulates fluid flow into and through the system. The letters
of ‘VERITAS’, each with a height of 500 µm, are patterned into the soft
controller as an indication of scale. b, A schematic (top) and qualitative
electrical analogy (bottom) of the octobot system; check valves, fuel
tanks, oscillator, reaction chambers, actuators and vent orifices are akin to
diodes, supply capacitors, electrical oscillator, amplifiers, capacitors and
pull-down resistors, respectively. c, Conceptual curves show key variables
as a function of time. (1) Nominal pressure drives fuel through the system
at a decreasing rate. (2) Pinch valves in the oscillator convert upstream
flow into alternating flow between red and blue channels. Flow rate and
switching frequency are functions of upstream pressure and downstream
impedance. (3) When upstream pressure is too low, oscillation is not
possible, so both sides flow at a reduced rate. (4) Catalyst decomposes fuel,
yielding pressurized gas, which flows downstream to the actuators and
the vent orifices concurrently. (5) Actuators deform (θ, actuator tip angle)
as the pressure changes. Vents must be sufficiently small to allow full
actuation, yet sufficiently large to allow timely venting.

as water evaporates and diffuses through the matrix, leaving behind
an open network of channels that are interfaced with the soft control- 
ler (Fig. 1f). Octobot fabrication is completed upon removal of excess 
matrix material (Fig. 1g). A more detailed description of this multi-step 
assembly process is provided in Extended Data Fig. 1. 

By combining micro-moulding with EMB3D printing, we rapidly 
patterned the required mesofluidic networks by extruding a fugitive 
ink, which is subsequently removed via auto-evacuation, through a 
fine nozzle that is embedded within the uncured elastomer matrix. To 
self-heal crevices that form within the ‘body’ matrix as the nozzle is 
translated during the printing process, we created a new elastomeric 
material containing fumed silica nanoparticles that exhibits thixotropic 
behaviour (Extended Data Fig. 2a). When completely restructured
or at rest, this matrix behaves like a Herschel-Bulkley fluid; that is, it
exhibits both shear-thinning behaviour (Extended Data Fig. 2b) and a
shear yield stress (Extended Data Fig. 2c). These properties ensure that
the extruded inks remain in place within the matrix. However, upon
yielding, the body matrix readily flows (Extended Data Fig. 2c) into any 
crevices formed. The body matrix restructures with time, ultimately 
recovering its original viscoelasticity (Extended Data Fig. 3), which 
ensures that EMB3D printing can be repeated later in the same matrix 
region. We also created a ‘fuel reservoir’ elastomeric matrix, into which 

Figure 4 | Octobot actuation. a, Two-bladder actuator design in which
traces (i) are printed in contact with the hyperelastic layer (ii) inside the
body matrix material (iii) and differences in modulus result in bending
upon inflation. The thickness, h, of the hyperelastic layer is modified
to change the characteristics of the actuator. In this example, the body
matrix material (iii) has a height of 800 µm. b, Top, the actuator tip angle,
θ, changes upon inflation. Scale bar, 10 mm. Bottom, mean displacement
angle, θ, taken from three representative actuators during five inflation


fuel reservoir channels are printed. Both the body and fuel reservoir
matrices are cross-linked within the mould after printing is completed. 
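
The Herschel-Bulkley behaviour described above can be illustrated with a short sketch. The model relates shear stress to shear rate as tau = tau_y + K * rate^n, with n < 1 for a shear-thinning fluid. The parameter values below are hypothetical placeholders, not values fitted to the Extended Data; they only show how a yield stress can hold a printed trace in place while the apparent viscosity falls as the nozzle shears the matrix.

```python
def herschel_bulkley_stress(shear_rate, tau_y, k, n):
    """Shear stress (Pa) at a given shear rate (1/s): tau = tau_y + k * rate**n."""
    if shear_rate < 0:
        raise ValueError("shear_rate must be non-negative")
    return tau_y + k * shear_rate ** n


def apparent_viscosity(shear_rate, tau_y, k, n):
    """Apparent viscosity (Pa s), i.e. stress divided by shear rate (rate > 0)."""
    if shear_rate <= 0:
        raise ValueError("shear_rate must be positive")
    return herschel_bulkley_stress(shear_rate, tau_y, k, n) / shear_rate


def yields(applied_stress, tau_y):
    """True if the applied stress exceeds the yield stress, so the matrix flows."""
    return applied_stress > tau_y


# Hypothetical parameters for illustration only (not measured values).
TAU_Y = 100.0  # yield stress, Pa
K = 20.0       # consistency index, Pa s^n
N = 0.5        # flow index; n < 1 means shear thinning

if __name__ == "__main__":
    for rate in (1.0, 10.0, 100.0):
        print(rate, apparent_viscosity(rate, TAU_Y, K, N))
```

With n < 1 the apparent viscosity decreases as the shear rate rises, and any stress below tau_y leaves the material at rest, which is the combination the text relies on.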

To create the fuel reservoirs, catalytic reaction chambers, actuator 
networks and vent orifices, two hydrogel-based inks (fugitive and 
catalytic) are EMB3D printed within the moulded matrix materials 
(Fig. 2a). These printed features are interfaced with each other as well 
as with the soft controller through the use of ‘fugitive plugs’ introduced 
at the inlets of the controller before filling the mould with the matrix 
materials. The fugitive ink is composed of an aqueous, poly(ethylene 
oxide)-b-poly(propylene oxide)-b-poly(ethylene oxide) triblock copolymer
(Pluronic F127) gel. The catalytic ink contains platinum
particles (Supplementary Video 2) suspended in a mixture of Pluronic
F127-diacrylate (F127-DA) and poly(ethylene glycol) diacrylate
(PEG-DA) that is photo-cross-linked after printing. The rheological
properties of both inks are specifically tailored for EMB3D printing
(Fig. 2b, Extended Data Fig. 4). The printed features produced from both
inks can be changed ‘on-the-fly’ by varying the print speed (Fig. 2c).
Typically, this fugitive ink must be removed or evacuated after printing
to yield open channels. However, we find that fugitive ink
composed of pure Pluronic F127 can be auto-evacuated by heating 
the printed features within the cross-linked, silicone-based matrices 
at 90°C (ref. 28; Fig. 2d, Extended Data Fig. 5). As water evaporation 


[Figure 4 panel residue: legend for b, h = 1,500 µm, 1,250 µm and 1,000 µm; labels for c, blue flow and red flow (state 2).]


cycles as a function of inflation pressure, for varying hyperelastic layer
heights, h (in µm). Error bars, denoted by the shaded regions, indicate
the 95% confidence interval. c, The oscillator of the soft controller
causes an octobot to alternate between blue and red actuation states.
The monopropellant fuel is dyed to show states. Scale bar, 5 mm. d, Stills
from top-down (top; Supplementary Video 5) and face-on (bottom;
Supplementary Video 6) operation videos show an octobot autonomously
alternating between blue (‘1’) and red (‘2’) actuation states. Scale bars, 10 mm.

ensues, the triblock copolymer species either form a thin coating at the
matrix-open channel interface or may partially diffuse into the matrix. 
The fugitive plugs within the inlets of the soft controller also undergo 
this auto-evacuation process, facilitating connectivity between the 
microfluidic logic and all printed mesofluidic components (Extended 
Data Fig. 6). By contrast, the catalytic ink is cross-linked in place after 
printing, yielding a platinum-laden plug within the matrix (Fig. 2e). 

To achieve the desired autonomous function, we incorporated a soft, 
microfluidic controller within the octobot (Fig. 3a). The control system 
is roughly divided into four sections: upstream (liquid fuel storage), 
oscillator (liquid fuel regulation), reaction chamber (decomposition 
into pressurized gas) and downstream (gas distribution for actuation 
and venting). Upstream, 0.5 ml of fuel is infused via a syringe pump 
into each of two fuel reservoirs printed into the hyperelastic matrix. 
Upstream check valves in the soft controller prevent fuel from flow- 
ing back out the fuel inlets. The fuel reservoirs expand elastically to 
a pressure of approximately 50 kPa, forcing fuel into the oscillator. 
The oscillator includes a system of pinch and check valves based on 
prior designs (ref. 11), which convert pressurized fuel inflow into alternating
fuel outflow. With one channel temporarily occluded, fuel from
the other channel flows from the outlets of the soft controller into 
the platinum-laden reaction chambers, where it rapidly decomposes. 
The resulting pressurized gas, which is prevented from returning to 
the soft controller via downstream check valves, flows into one of the 
downstream mesofluidic networks consisting of four actuators and one 
orifice. The supplied pressure deflects the actuators and exhausts to 
atmosphere through the vent orifice. Therefore, for robust actuation 
and timely venting, a balance must be reached between supply gas flow, 
actuation pressure and exhaust rate. These subcomponents operate on 
the basis of the interaction and timing of the local pressures, which is 
conceptually similar to an electrical oscillator (Fig. 3b). Upon successful 
venting, the fuel flow into one reaction chamber stops and flow to the 
other begins, initiating a similar sequence in the other downstream 
catalytic chamber and actuator network (Fig. 3c). 

To provide an on-board power source, we use 50 wt% aqueous hydrogen
peroxide as the fuel, owing to its high energy density (1.44 kJ g⁻¹,
compared with 0.1–0.2 kJ g⁻¹ for batteries) and its benign decomposition
by-products. As the fuel decomposes in the presence of the platinum catalyst,
the following reaction occurs: 2H₂O₂ (l) → 2H₂O (l, g) + O₂ (g).
This reaction results in volumetric expansion by a factor of approximately
240 (at ambient pressure). At our operating pressure of 50 kPa,
an expansion to 160 times the original volume is expected. Although 
higher fuel concentrations would provide increased expansion and 
energy density, concentrations above 50 wt% drastically increase the 
decomposition temperature, resulting in combustion within the printed 
catalytic reaction chambers (Supplementary Videos 3 and 4). Because 
this monopropellant liquid fuel can be handled in small volumes and 
decomposes at the point of use, we can use microfluidic logic to directly 
handle the fuel, eliminating the need for external valves to control gas
at high pressure and flow rate. 
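
The energy density and expansion figures quoted above can be cross-checked with a few lines of arithmetic. This is a sketch under stated assumptions, not the authors' calculation: the decomposition enthalpy (about 98.2 kJ per mole of H2O2) and the molar mass are standard textbook values that do not appear in the text, the 50 kPa operating pressure is treated as a gauge pressure, and the gas is treated as ideal and isothermal.

```python
H2O2_MOLAR_MASS_G_PER_MOL = 34.01     # assumption: standard molar mass
DECOMP_ENTHALPY_KJ_PER_MOL = 98.2     # assumption: H2O2(l) -> H2O(l) + 1/2 O2(g)
ATMOSPHERE_KPA = 101.325              # assumption: standard ambient pressure


def energy_density_kj_per_g(h2o2_weight_fraction):
    """Heat released per gram of aqueous fuel at the given H2O2 weight fraction."""
    return h2o2_weight_fraction * DECOMP_ENTHALPY_KJ_PER_MOL / H2O2_MOLAR_MASS_G_PER_MOL


def expansion_at_gauge_pressure(ambient_expansion, gauge_kpa):
    """Scale an ambient-pressure expansion factor to a higher absolute pressure
    using Boyle's law (ideal gas, constant temperature)."""
    return ambient_expansion * ATMOSPHERE_KPA / (ATMOSPHERE_KPA + gauge_kpa)


if __name__ == "__main__":
    print(round(energy_density_kj_per_g(0.5), 2))            # energy density, kJ/g
    print(round(expansion_at_gauge_pressure(240.0, 50.0)))   # expansion factor
```

Under these assumptions the 50 wt% fuel gives roughly 1.44 kJ per gram, and the ambient expansion factor of 240 scales to about 160 at 50 kPa above ambient, in line with the values stated in the text.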

The geometry of the microfluidic soft controller is designed to operate
at a fuel flow rate of about 40 µl min⁻¹, thereby yielding pressurized
gas at a rate of about 6.4 ml min⁻¹ (ref. 11). Under these operating conditions,
a theoretical run time of 12.5 min could be achieved using
a system with a fuel capacity of 1 ml. The actuators, which consist of 
printed bladders in contact with a lower-modulus, hyperelastic elasto- 
mer layer (Fig. 4a), are designed to inflate asymmetrically to generate 
angular displacement. Their maximum working pressure and dis- 
placement are tuned on the basis of the thickness of the hyperelastic 
layer (Fig. 4b, Extended Data Fig. 7). If this layer is too thin, then it 
ruptures prematurely. However, the working pressure increases with 
thickness. As a compromise, we selected a layer thickness of 1,000 µm
because it affords consistent performance at the lowest working pres- 
sure. In parallel with the actuators, we tailored the diameter of the vent 
orifices by modulating print speed. Orifices roughly 75 µm in width
allowed proper actuator displacement with timely subsequent venting. 
The ability to rapidly pattern and adjust the geometry of these features
on-the-fly via EMB3D printing allowed us to iterate through more 
than 30 designs and nearly 300 octobots to converge on an appropriate 
system-level architecture. 
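
The gas generation rate and theoretical run time quoted in this paragraph follow from simple arithmetic, sketched below. The function names are hypothetical. The sketch assumes that the 40 µl min⁻¹ figure applies to each of the two oscillator channels (80 µl min⁻¹ in total, as stated in the Methods) and that the expansion factor at operating pressure is 160.

```python
def gas_rate_ml_per_min(fuel_ul_per_min, expansion_factor):
    """Gas volume produced per minute from a liquid fuel flow rate in µl/min."""
    return fuel_ul_per_min * expansion_factor / 1000.0


def run_time_min(fuel_ml, ul_per_min_per_channel, channels=2):
    """Time to consume the on-board fuel when every channel draws at the same rate."""
    return fuel_ml * 1000.0 / (ul_per_min_per_channel * channels)


if __name__ == "__main__":
    print(gas_rate_ml_per_min(40.0, 160.0))  # gas rate per channel, ml/min
    print(run_time_min(1.0, 40.0))           # run time for 1 ml of fuel, min
```

With these inputs the sketch gives 6.4 ml min⁻¹ of gas per channel and a run time of 12.5 min for 1 ml of fuel, matching the figures in the text.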

Through this iterative process, we created octobots with embedded 
components that work together in concert to alternate between the 
red and blue actuation states shown in Fig. 4c. The resulting octobots 
operated autonomously (Fig. 4d, Extended Data Fig. 8, Supplementary 
Videos 5 and 6), cycling between actuation states for four to eight min- 
utes. Although this is less than the predicted theoretical run time, the 
soft controller alternates actuation states as expected. We believe that 
downstream impedances arising from decomposition-actuation- 
venting cycles, as well as the decreasing flow rate of fuel into the soft 
controller with time, are responsible for the departure from theoretical
performance (ref. 11). These issues can be addressed by integrating
more sophisticated microfluidic circuits, such as universal logic gates, 
or components with ‘gain’ that enable advanced control schemes 
(see Methods for an extended discussion). 

We have demonstrated the untethered operation of a robot com- 
posed solely of soft materials. The coupling of monopropellant fuels 
and microfluidic logic allowed us to power, control and realize auton- 
omous operation of these pneumatically actuated systems. Through 
our hybrid assembly approach, we both constructed the robot body 
and embedded the necessary components for fuel storage, catalytic 
decomposition and actuation to enable system-level function in a rapid 
manner. The octobot is a minimal system designed to demonstrate 
our integrated design and fabrication strategy, which may serve as a 
foundation for a new generation of completely soft, autonomous robots. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS


Soft controller fabrication. Soft controllers are fabricated from Sylgard 184 PDMS 
(Dow Corning Corp.) using soft lithography moulding and bonding techniques. 
First, a mould was fabricated on a silicon wafer using SU-8 negative photoresist 
(Microchem Corp.). SU-8 3050 photoresist was used to achieve a film thickness
of 100 µm. Baking, exposing and developing steps were performed in accordance with
the product datasheet. The completed wafer is placed in
a Petri dish to form a completed mould assembly.

Soft controllers consist of an upper mould, a lower mould and an intermediate
thin film. The upper and lower moulds are made on one wafer to ease fabrication.
PDMS is poured into the mould assembly to a height of 1 mm. Separately, PDMS
is spin-coated onto a wafer at 1,500 r.p.m. for 60 s for a film thickness of 35 µm.
After curing at 90°C for 20 min, PDMS forms are removed from the moulds and
holes are punched at all inlets and outlets. The upper layer is bonded to the
wafer-adhered thin film after exposing to oxygen plasma at 35 W for 20 s in a Deiner Pico
plasma system (Deiner Electronic GmbH). Holes are punched in the thin film,
masks are placed as described in ref. 25, and the lower layer is bonded to the thin 
film using the plasma recipe above. 
Ink and matrix materials. Two inks (a ‘fugitive ink’ and a ‘catalytic ink’) are
formulated for EMB3D printing. The fugitive ink is prepared by adding 27 wt% 
gel of Pluronic F127 to ice-cold, deionized, ultra-filtrated (DIUF) water, followed 
by mixing in a planetary mixer for 5 min at 2,000 r.p.m., and storing at 4°C. The 
fugitive ink is not used until the Pluronic F127 completely dissolves in solution. The 
ink is prepared for printing by loading the solution at 4°C into a 3-cm³ syringe barrel
(EFD Nordson) and centrifuging at 3,000 r.p.m. for 5 min to degas. For EMB3D
printing, the barrel of the fugitive ink is fitted with a stainless steel nozzle (0.15-mm 
inner diameter; EFD Nordson). 

The catalytic ink is prepared by first synthesizing and then dissolving a dia- 
crylated Pluronic (F127-DA) at 30 wt% concentration with a solution of Irgacure 
2959 (at 0.5 wt%, BASF) in DIUF water at 4°C. The F127-DA is synthesized under 
an inert nitrogen atmosphere by first adding 400 ml of dry toluene (Sigma-Aldrich 
Co.) to a three-neck flask fixed to a condenser with circulating cold water and 
magnetically stirred at 300 r.p.m. Pluronic F127 (70 g, Sigma-Aldrich Co.) is
then dissolved in the toluene after heating the solvent to 60°C. After the solution 
is allowed to cool to room temperature, triethylamine (5.6 g, Sigma-Aldrich Co.) 
is added to the solution, followed by the drop-wise addition of acryloyl chloride 
(5g, Sigma-Aldrich Co.) with continued stirring, both at a molar ratio of 10:1 with 
the Pluronic F127. The reaction mixture is stirred overnight and maintained in 
the inert atmosphere. The diacrylated Pluronic F127 (F127-DA) product is then 
filtered from the yellow triethylammonium hydrochloride by-product and precip- 
itated from the filtered solution with hexane (Sigma-Aldrich Co.) at a 1:1 volume 
ratio. The F127-DA is obtained through a second filtration step and allowed to dry 
in a chemical hood for at least 24h. This protocol is adapted from ref. 13. For each 
gram of this base F127-DA mixture, 100 mg of PEG-DA is added, and this solution 
is mixed in a planetary mixer for 1 min at 2,000 r.p.m. and degassed for 3 min at 
2,200 r.p.m. This mixture is then stored in the dark at 4°C. Finally, 5 w/w% Pt black 
(Sigma-Aldrich Co.) is added to this base solution at 4°C and mixed in a planetary 
mixer for 5 min at 2,000 r.p.m. The Pt-filled F127-DA physically gels during mixing, 
facilitating loading into a UV-blocking 3-cm³ syringe barrel (EFD Nordson) for
printing. This catalytic ink is freshly prepared for each print session, as the Pt black
slowly cross-links the acrylate moieties present in the ink. After EMB3D printing,
the catalytic ink is cross-linked for 15 min at 18 mW cm⁻² under a UV source
(Omnicure EXFO). For EMB3D printing, the syringe barrel housing this ink is 
fitted with a stainless steel nozzle (0.33-mm inner diameter; EFD Nordson). 
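
The stated 10:1 molar ratios of triethylamine and acryloyl chloride to Pluronic F127 can be checked against the quoted masses. The sketch below is illustrative only; the molar masses are assumptions that do not appear in the text (a nominal average of about 12,600 g mol⁻¹ for Pluronic F127, and standard values for the two small molecules).

```python
# Assumed molar masses in g/mol (not given in the text).
MOLAR_MASS = {
    "pluronic_f127": 12600.0,     # nominal average for the triblock copolymer
    "triethylamine": 101.19,
    "acryloyl_chloride": 90.51,
}

# Masses in grams, as quoted in the synthesis description.
MASS_G = {
    "pluronic_f127": 70.0,
    "triethylamine": 5.6,
    "acryloyl_chloride": 5.0,
}


def moles(species):
    """Amount of substance in mol for one of the listed reagents."""
    return MASS_G[species] / MOLAR_MASS[species]


def molar_ratio_to_f127(species):
    """Molar ratio of a reagent relative to Pluronic F127."""
    return moles(species) / moles("pluronic_f127")


if __name__ == "__main__":
    for reagent in ("triethylamine", "acryloyl_chloride"):
        print(reagent, round(molar_ratio_to_f127(reagent), 1))
```

Under these assumptions both reagents come out close to ten equivalents per F127 chain, consistent with the 10:1 ratio stated above.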

Two matrix materials are developed for fabricating fully soft robots. The first 
matrix, referred to as the ‘body matrix’, is prepared by blending two silicone-based
materials: Sylgard 184 and SE 1700 (Dow Corning Corp.). SE 1700 is a silicone 
elastomer paste that contains fumed silica nanoparticles. Sylgard 184 PDMS is 
used to dilute SE 1700 to achieve the desired rheological response for embedded 
3D printing. After exploring several blends, we found that the optimal body matrix 
is composed of a 1:1 mass ratio of SE 1700 (4:1 ratio of base to hardener) and Sylgard
184 (10:1 ratio of base to hardener). This matrix is prepared by mixing the blend in 
a planetary mixer at 2,000 r.p.m. for 3 min with degassing at 2,200 r.p.m. for 2 min. 
The second matrix, referred to as the ‘fuel reservoir matrix’, is prepared by mixing
Part A Ecoflex 00-30 with Part B Ecoflex 00-30 (with 1.2 w/w% Slo-Jo Platinum
Silicone Cure Retarder and 1.2 w/w% Thivex, Smooth-On Inc.) in a 1:1 ratio.
The matrix is prepared in a planetary mixer at 2,000 r.p.m. for 1.5 min with degas- 
sing at 2,200 r.p.m. for 1 min. 

Last, the ‘fugitive plug’ material used to prevent ingress of the body matrix mate- 
rial into the soft controller is prepared before printing by first synthesizing and then 
mixing a diacrylated Pluronic material (F127-DA) (at 30 wt% in a 0.5 wt% solution 
of Irgacure 2959 in deionized water) with F127 (at 30 wt% in deionized water) at a 
mass ratio of 1:4. The fugitive plug is stored in the dark at 4°C in a syringe. When
used, the fugitive plug material is allowed to physically gel before it is cross-linked
for 3 min at 6 mW cm⁻² under a UV source.

Rheological characterization. All rheological measurements are carried out 
using a controlled-stress rheometer (DHR-3, TA Instruments) equipped with a 
40-mm-diameter, 2° cone and plate geometry. In all experiments, the fugitive and 
catalytic inks are equilibrated at room temperature for 1 min before testing; the 
fuel reservoir and body matrix materials are equilibrated for 20 min and 10 min, 
respectively, to simulate the times at which octobot printing began with each 
material. Shear storage moduli are measured as a function of shear stress at a 
frequency of 1 Hz. 

The body matrix materials are characterized by both flow sweep and flow ramp 
tests to determine their rheological response (Extended Data Fig. 2). In addition, 
three-phase modulus recovery tests are carried out to quantify the recovery time 
of the body matrix stiffness after applying a shear stress that exceeds the equilibrium
yield stress, τ_y (Extended Data Fig. 3). In the first set of experiments, flow
sweeps from low (10-7 s~!) to high (10? s~!) shear rates are carried out immediately 
followed by flow ramps from high to low shear rates. In the second set of experiments,
shear storage (G′) and loss (G″) moduli are measured during three phases
of applied shear stresses (at 1-Hz frequency): 1 Pa for 3 min; 100 Pa for either 1 s,
10 s or 100 s; and 1 Pa for 30 min. We define the thixotropic recovery time as the
instant at which G′ = G″, or when tan(δ) = G″/G′ = 1, where δ is the phase angle.
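
The recovery-time definition above (the instant at which G′ = G″) can be extracted from sampled modulus data as sketched below. The function name and the sample data are hypothetical, and linear interpolation between samples is an assumption; the text does not describe how the crossover was located in practice. Times are taken relative to the start of the final low-stress phase.

```python
def crossover_time(times, g_storage, g_loss):
    """Return the first time at which G' rises to meet G'' (tan(delta) = 1).

    Interpolates linearly between the two samples that bracket the crossing.
    Returns None if G' never crosses G'' from below in the data.
    """
    prev_t = None
    prev_diff = None
    for t, gs, gl in zip(times, g_storage, g_loss):
        diff = gs - gl
        if diff == 0:
            return t
        if prev_diff is not None and prev_diff < 0 < diff:
            fraction = -prev_diff / (diff - prev_diff)
            return prev_t + fraction * (t - prev_t)
        prev_t, prev_diff = t, diff
    return None


if __name__ == "__main__":
    # Hypothetical data: G' recovers while G'' stays constant.
    t_s = [0.0, 1.0, 2.0, 3.0]
    g_prime = [10.0, 30.0, 50.0, 70.0]
    g_double_prime = [40.0, 40.0, 40.0, 40.0]
    print(crossover_time(t_s, g_prime, g_double_prime))
```

For the made-up data in the usage block, G′ − G″ changes sign midway between the second and third samples, so the function reports a recovery time of 1.5 s.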
Actuator characterization. Actuators are printed into special actuator characterization
moulds by EMB3D printing and then auto-evacuated. To prepare them for
characterization, they are first released from the mould assembly and then a 1-mm hole
is created with a biopsy punch (Miltex Inc.), which serves as the air inlet. Finally, the 
actuator is pressurized slightly to ensure inflation. Each actuator design is tested for 
angular displacement (that is, the actuator is allowed to deflect unconstrained and 
the total displacement angle is measured) and blocked force (that is, the actuator 
is constrained from deflection and resultant force is measured; see Extended Data 
Fig. 7). For each actuator, break-in testing consists of five cycles, in which actuator 
air pressure is slowly (over about 30 s) ramped up to the pressure set point, then
slowly (over about 30 s) ramped down to ambient. The pressure set point for the first
cycle is P₀ and the set point for all subsequent cycles is P₁ (Extended Data Table 1).
Data acquisition consists of five additional cycles for each actuator, in which air
pressure is cycled as above to pressure set point P₁.
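
The cycling protocol described above can be written out as a pressure schedule. The sketch below is illustrative: the function names are hypothetical, pressures are gauge values with 0 meaning ambient, and the set-point values in the usage block are placeholders, because the actual values of the two set points are in Extended Data Table 1 and are not reproduced here.

```python
def cycle_set_points(p_first, p_rest, break_in=5, acquisition=5):
    """Peak pressure for each cycle: the first break-in cycle goes to p_first,
    every later break-in and data-acquisition cycle goes to p_rest."""
    return [p_first] + [p_rest] * (break_in - 1 + acquisition)


def ramp_profile(set_points, ramp_s=30.0):
    """Piecewise-linear (time_s, gauge_pressure) breakpoints for the cycles:
    ramp up to each peak over ramp_s, then back down to ambient over ramp_s."""
    points = [(0.0, 0.0)]
    t = 0.0
    for peak in set_points:
        t += ramp_s
        points.append((t, peak))
        t += ramp_s
        points.append((t, 0.0))
    return points


if __name__ == "__main__":
    # Placeholder set points in kPa (hypothetical, not from Extended Data Table 1).
    peaks = cycle_set_points(60.0, 50.0)
    for time_s, pressure in ramp_profile(peaks):
        print(time_s, pressure)
```

Five break-in cycles plus five acquisition cycles at about 30 s per ramp give a schedule of roughly ten minutes per actuator.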

To characterize their angular displacement, actuators are plumbed with regulated 
compressed air and mounted vertically between a matte black background and a 
Sony NEX3 digital camera for video data acquisition. Actuators are pressurized with 
five break-in cycles as described above, followed by five data-acquisition cycles. 
As above, the first break-in cycle is to P₀ and all subsequent break-in and
data-acquisition cycles are to P₁. Video data are analysed using the ImageJ image analysis
platform (NIH.gov) to obtain the bend angle versus pressure for each actuator. 

For blocked-force characterization, individual actuators are mounted on a fixed 
platform beside an Instron model 5544 materials testing frame (Illinois Tool Works 
Inc.). The actuator is lowered until just above the force sensor portion of the test- 
ing frame and the actuator is plumbed with regulated compressed air. Actuators 
typically behave differently upon the initial few actuations versus subsequent actuations,
owing to the Mullins effect. Each actuator therefore receives five break-in
cycles before data acquisition. Air pressure and actuator force data are recorded on 
the Instron testing frame data acquisition system at 100-ms intervals. 

Mould fabrication. Octobot moulds are fabricated inside a CNC machined acetal 
mould equipped with two locating pins to mount the soft controller. Their desired 
shape is modelled in SolidWorks (Dassault Systèmes SOLIDWORKS Corp.).
A negative mould is modelled and output in Parasolid format for file transfer. 
MasterCAM (CNC Software, Inc.) is used to develop all machining tool paths and 
to export the final G-code for fabrication. Blanks (12.7 cm × 7.6 cm × 2.54 cm)
are cut from black acetal (Delrin) (McMaster Carr). Acetal is used, owing to its
dimensional stability, and 2.54-cm-thick stock is chosen to prevent warping during 
machining and repeated octobot curing cycles. Octobot moulds are produced by 
CNC milling on a HAAS OM-2A vertical machining centre (HAAS Automation 
Inc.). 1-mm dowel pins are pressed into drilled holes for controller mounting. 
Soft robot assembly. A custom-designed, multi-material 3D printer (ABL 10000, 
Aerotech Inc.) with four independently z-axis addressable ink reservoirs is used 
to pattern fugitive and catalytic inks within the octobot matrices. All G-code
for printing is generated from Python-based software (MeCode, developed by 
J. Minardi). Prior to EMB3D printing, Ecoflex 30 (Smooth-On, Inc.) is first prepared 
with 1 wt% Slo-Jo and 0.25 wt% Thivex (both with respect to Part A) by mixing 
in a Thinky planetary mixer for 1.5 min with a 1-min degas cycle. This uncured 
Ecoflex 30 is cast into the actuator layers of the octobot mould and degassed in 
a vacuum chamber for 3 min. A glass slide is used to remove excess material and 
create smooth surfaces that will ultimately become the extensible layers of the actu- 
ators. The moulds are then placed in a 90°C oven for 30 min to cure the Ecoflex, 
removed, and trimmed of excess material as necessary. 
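
As an illustration of the scripted G-code generation mentioned above, the sketch below emits generic linear-move commands in which the print speed changes partway along a trace, the mechanism used to vary trace width on-the-fly. It is plain Python and does not use MeCode or reproduce its interface. The function names are hypothetical, and the assumption that the feed word F is given in mm per minute is a common convention that may not match the printer controller used here.

```python
def linear_move(x_mm, y_mm, speed_mm_s):
    """Format one absolute linear move; F is assumed to be in mm/min."""
    feed_mm_min = speed_mm_s * 60.0
    return f"G1 X{x_mm:.3f} Y{y_mm:.3f} F{feed_mm_min:.1f}"


def trace_with_speed_change(length_mm, slow_mm_s, fast_mm_s):
    """Straight trace along x: first half printed slowly (wider trace),
    second half printed quickly (narrower trace)."""
    half = length_mm / 2.0
    return [
        "G90",  # absolute positioning
        linear_move(half, 0.0, slow_mm_s),
        linear_move(length_mm, 0.0, fast_mm_s),
    ]


if __name__ == "__main__":
    # Speeds taken from the range shown in Fig. 2d (0.5 and 10 mm/s).
    for line in trace_with_speed_change(20.0, 0.5, 10.0):
        print(line)
```

Generating the moves from a script makes it straightforward to change speeds, and hence feature dimensions, between design iterations without editing G-code by hand.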
A soft controller is then loaded onto the press-fit pins placed in the printing
mould with the polyimide (Kapton) tape still adhered. Registration coordinates and 
print heights are then taken from the cured Ecoflex layers in the actuators and in all 
inlets of the soft controller; these are essential for EMB3D printing and provided 
to the custom print software. The fuel reservoir and body matrix materials are pre- 
pared as described previously. While the body matrix material is mixing, the fuel 
reservoir matrix is deposited in the fuel reservoir region of the printing mould. It is 
then degassed for 3 min to ensure no trapped gas is present. Excess bubbles in the 
uncured fuel reservoir matrix are removed with a pipettor. The non-gelled, chilled 
fugitive plug is then filled throughout the soft controller via injection through the 
inlets. While the fugitive plug is still in the liquid state, it is briefly degassed in a 
vacuum chamber. The fugitive plug material is then allowed to physically cross-link, 
excess gel is scraped from the top of the tape, the tape is removed, and the fugitive
plug is photo-cross-linked with a UV source at 6 mW cm⁻² for 3 min. After
the gels are cross-linked, the body matrix is cast within the mould, covering the 
fuel reservoir matrix and the fugitive plug-filled soft controller and degassed for 
1-3 min in situ. Again, excess bubbles are removed with a pipettor, excess material 
is scraped off and away from the mould with a glass slide, and EMB3D printing of 
the fugitive and catalytic inks begins. After printing, the entire mould is cured at 
18 mW cm⁻² for 15 min to cross-link the catalytic ink. The mould is then transferred
to a 90°C oven, where the matrix materials cross-link. The octobot is removed from
the mould and kept at 90°C for 4 days to facilitate auto-evacuation of the inks. 

After auto-evacuation, the octobot is release-cut from the surrounding matrix 
material using a CO₂ laser (Universal Laser Systems) and cleaned with isopropyl
alcohol and water. Sylgard 184 PDMS (Dow Corning Corp.) is poured into the
open cavity of the octobot above the soft controller to a height of 1.5 mm and
cured at 90°C for 20 min. A 1-mm biopsy punch (Miltex Inc.) is used to punch
holes through the newly poured PDMS layer and into the fuel inlets. Dyed water
is injected into these holes to inflate the fuel tanks, flow through the system and
ensure proper octobot function. Holes are punched in the downstream vent orifices to
allow the water to vent from the system. 

The octobot is loaded into an acrylic tank outfitted with a backlight to highlight 
coloured fuel as it flows through the system. Aqueous hydrogen peroxide (90 wt%,
HTP grade, Peroxychem) is diluted to 50 wt%, and two samples of this liquid fuel,
dyed red and blue, are filled into two syringes. The syringes are
loaded onto a syringe pump, and connected to the octobot via 1-mm-diameter 
silicone rubber tubing. Water is flowed into the acrylic tank to wash away dye in the 
octobot exhaust stream and drained into a nearby sink. The syringe pump flows fuel 
at a rate of 3 ml min⁻¹ (each syringe) into the octobot for 10 s. The silicone rubber
tubing is removed with tweezers from the octobot, which is allowed to operate 
untethered. The octobot alternates actuation until fuel pressure is insufficient to 
switch the oscillator and alternating actuation ceases. 

Imaging and videography. Photographs and supporting videos are acquired 
with a digital SLR camera (Canon EOS 5D Mark II, Canon USA Inc.) and a 4K
video camera (Blackmagic Production 4K, Blackmagic Design). Photos are cropped using
Inkscape vector graphics editor (http://www.inkscape.org) and video sequences are 
clipped from raw footage and exported using iMovie (Apple Corp; titles are added 
using Premiere Pro, Adobe Systems, Inc.). All print parameter measurements 
and images of EMB3D printed features in octobots are taken with a digital zoom
microscope (VHX-2000, Keyence). Their mean values and standard deviations are 
determined from three samples printed at each print speed of interest. 
Considerations for future microfluidic controllers, logic and actuation. The 
octobot represents a minimal soft robotic system that demonstrates our integrated 
design and fabrication strategy. In our self-contained microfluidic controllers, 
system scaling is limited by fuel flow rate, on-board fuel supply, downstream 
decomposition/expansion and actuation-network design. At 50% concentration by 
weight, 1 g of aqueous hydrogen peroxide expands to approximately 200 ml of gas 
under ambient pressure and temperature conditions'*. Our oscillator is designed 
to operate at 40 µl min⁻¹ per channel (80 µl min⁻¹ total), and our actuators inflate
with approximately 0.2 ml at 50 kPa, the equivalent of 0.3 ml at ambient pressure. 
Hence, with each channel inflating four actuators, our current design has a theoret- 
ical maximum oscillation rate of 5.5 switches per minute. Although the controller 
and actuators could be redesigned for increased performance, any system scaling 
would require careful balancing of fuel supply, flow rate and actuator requirements. 
Alternative actuator designs are possible (see Extended Data Fig. 7), but our current 
actuators produce 0.04 N; therefore, two actuators would theoretically be sufficient 
to lift the 7-g robot. 
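
The gas-yield and lifting figures quoted above can be checked with a short calculation. The sketch below is illustrative only: it assumes that the gas volume comes solely from the oxygen released by 2 H2O2 → 2 H2O + O2, that the gas behaves ideally at 25 °C and 1 atm, and that lifting merely requires the summed actuator force to exceed the weight of the robot. The function names are hypothetical and are not taken from the original work.

```python
import math

R = 8.314462618      # molar gas constant, J mol^-1 K^-1
M_H2O2 = 34.0147     # molar mass of hydrogen peroxide, g mol^-1


def oxygen_volume_ml(fuel_mass_g, wt_fraction, temp_k=298.15, pressure_pa=101325.0):
    """Ideal-gas volume (ml) of O2 released by full decomposition of the fuel.

    Assumes 2 H2O2 -> 2 H2O + O2 and that the water stays liquid.
    """
    n_h2o2 = fuel_mass_g * wt_fraction / M_H2O2   # mol of H2O2 in the fuel
    n_o2 = n_h2o2 / 2.0                           # one O2 per two H2O2
    volume_m3 = n_o2 * R * temp_k / pressure_pa   # V = nRT / P
    return volume_m3 * 1e6                        # m^3 -> ml


def actuators_needed(robot_mass_g, force_per_actuator_n, g=9.80665):
    """Smallest number of actuators whose summed force exceeds the robot weight."""
    weight_n = robot_mass_g / 1000.0 * g
    return math.ceil(weight_n / force_per_actuator_n)


gas_ml = oxygen_volume_ml(1.0, 0.5)    # 1 g of 50 wt% fuel
n_act = actuators_needed(7.0, 0.04)    # 7-g robot, 0.04 N per actuator
print(f"O2 volume: {gas_ml:.0f} ml; actuators needed: {n_act}")
```

Under these assumptions the oxygen alone amounts to roughly 180 ml per gram of fuel, in line with the approximate value of 200 ml quoted above, and the 7-g robot weighs about 0.069 N, so two 0.04-N actuators suffice.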

On the basis of these demonstrated capabilities, we anticipate that more sophis- 
ticated microfluidic control systems and logic devices could be readily incorpo- 
rated within these printed and moulded robots. For example, fluidic versions of 
electric logic gates have been reported, including NAND/NOR, AND/OR and 
XOR/XNOR*-*, and flip-flops and gain valves***5. These complex systems are 
based on well-established, electrical design rules. Implementation of these systems 
would enable more complex actuation strategies, such as multi-degree-of-freedom 
actuators in which planned limb motion would prescribe a true gait with aerial 
and ground phases to lift and propel the soft robot. Alternatively, actuators could 
be designed to take advantage of material elasticity, in which actuation performs 
flexion and abduction, and passive limb elasticity provides extension and adduction. 
One could even envision an actuation strategy in which pneumatic channels act as 
sensors, providing true closed-loop feedback to the controller. 
Extended Data Figure 1 | EMB3D printing of an octobot. a, An EMB3D
printing mould is machined from acetal. b, The hyperelastic layers needed 
for actuation are cast and cross-linked in the actuator regions of the 
mould. c, A soft controller protected with a polyimide tape mask is loaded 
onto the pins of the EMB3D printing mould. d, The fuel reservoir matrix 
material is carefully loaded into the fuel reservoir area of the mould and 
degassed under vacuum. e, Liquefied fugitive plug material is manually 
loaded into the soft controller via the inlets and briefly degassed. f, The 
protective tape is removed after the fugitive plug material physically gels,
and the fugitive plug is photo-cross-linked. g, The body matrix material
is cast into the mould and degassed. h, Any excess body matrix material
is removed with a squeegee step, EMB3D printing begins, and the entire
mould and EMB3D-printed materials are placed in a 90°C oven to cross- 
link. i, After 2h, the cross-linked octobot is removed from its mould and 
kept at 90°C for a total of 4 days to ensure complete auto-evacuation of the 
aqueous fugitive inks. j, Before operation, excess body matrix material is 
removed via laser cutting. k, The final octobot, shown here in a close-up 
view, is prepared for operation. 
Extended Data Figure 2 | Rheological properties of the body matrix.
a, Schematic illustration of the behaviour of the body matrix during the
EMB3D printing process. (i) When the body matrix is at rest, the fumed
silica fillers within the silicone material form a percolated network, giving
rise to its equilibrium, at-rest shear yield stress, τ_y0. (ii) As the nozzle
travels through the matrix during EMB3D printing, the matrix is yielded
and the percolated filler network is disrupted, decreasing the apparent
yield stress of the matrix material, τ_y. (iii) Sufficient deformation can
completely disrupt the fumed silica microstructure and completely
eliminate the yield stress of the matrix material (τ_y → 0 Pa). (iv) The
fumed silica network does not immediately recover when it returns to
a quiescent state. (v) Over time, the network slowly restructures to (vi)
its equilibrium microstructure, and τ_y → τ_y0. b, c, Log-log plots of
apparent viscosity (b) and corresponding shear stress (c) versus shear rate
for various PDMS matrix formulations, which are prepared by blending 
Sylgard 184 (10:1 ratio of base to hardener) and SE 1700 (4:1 ratio of base 
to hardener) at various mass fractions. The formulations are listed by the 
weight ratio of SE 1700 used (0.0, 0.33, 0.5, 0.67 and 1.0). Closed and open
circles in c represent measurements taken during the flow sweep and flow 
ramps of the thixotropic loop studies, respectively. The final body matrix, 
formulated from the 50 wt% SE 1700 blend, shows clear thixotropic 
behaviour and a substantial decrease in yield stress upon yielding. Blends 
with higher concentrations of filler particles show diminished thixotropic 
behaviour, and the yield stress is not eliminated during nozzle translation. 
Consequently, crevices or air pockets form during printing in matrix
materials that contain higher concentrations of fumed silica.
Extended Data Figure 3 | Storage modulus recovery of the body
matrix after yielding. a, A plot of storage modulus (G′) as a function
of time illustrates how the modulus of the body matrix recovers during
three-phase thixotropy tests. After a probe phase, a shear stress of 100 Pa
is applied for varying times during a deformation phase, resulting in
temporary fluidization of the matrix material. During the recovery phase,
the modulus increases over time. b, The ratio of the loss modulus to the
storage modulus, G″/G′ = tan(δ), is plotted as a function of time for each
of the recovery phases measured in a. The onset of recovery of the yield
stress of the body matrix material (and the onset of fumed silica filler
percolation in a recovering matrix material) is assumed to be the moment
G′ = G″ or tan(δ) = 1 (horizontal dashed line). Therefore, the ‘recovery
time’ of the body matrix material (indicated by the vertical dashed lines) is
approximately the time at which tan(δ) = 1 after deformation. Because the
momentary deformation incurred by nozzle translation through a discrete
volume of matrix material during EMB3D printing happens within a
time period shorter than 1 s and with a magnitude of less than 100 Pa,
the thixotropic recovery time of the body matrix material is less than
200 s, the approximate time it takes the body matrix to recover after being
sheared by a 100-Pa stress for 1 s.
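
The recovery-time criterion described in this caption (the moment at which tan(δ) falls to 1) can be sketched as a simple threshold search. The example below is a minimal illustration that assumes the recovery-phase data are available as paired lists of time and tan(δ) and that linear interpolation between samples is adequate; the function name and the sample values are hypothetical and are not measurements from this work.

```python
def recovery_time(times_s, tan_delta):
    """Return the first time (s) at which tan(delta) drops to 1, or None.

    tan(delta) = G''/G' > 1 means the matrix is still fluid-like; the
    crossing to tan(delta) <= 1 is taken as the onset of recovery.
    Linear interpolation is used between neighbouring samples.
    """
    if not times_s:
        return None
    if tan_delta[0] <= 1.0:
        return times_s[0]          # already recovered at the first sample
    for i in range(1, len(times_s)):
        y0, y1 = tan_delta[i - 1], tan_delta[i]
        if y0 > 1.0 >= y1:
            t0, t1 = times_s[i - 1], times_s[i]
            return t0 + (1.0 - y0) * (t1 - t0) / (y1 - y0)
    return None                    # never recovered within the record


# Hypothetical recovery-phase samples, for illustration only.
times = [0.0, 50.0, 100.0, 150.0, 200.0, 250.0]
tan_d = [3.0, 2.2, 1.6, 1.2, 0.9, 0.8]
t_rec = recovery_time(times, tan_d)
print(t_rec)
```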
Extended Data Figure 4 | Rheological and printing behaviour of inks
and matrix materials. a, A log-log plot of apparent viscosity as a function
of shear rate for the fugitive ink (red), catalytic ink (black), body matrix
material (blue) and fuel reservoir matrix material (green). b, An octobot
with fluorescently dyed fugitive inks (red, not auto-evacuated) and
hyperelastic actuator layers (blue) fabricated by moulding and EMB3D
printing.
Extended Data Figure 5 | Auto-evacuation of the fugitive and catalytic inks. Photographs of the reaction chambers of an octobot showing the
upstream portions of the actuator networks (top) and a one-pad actuator (bottom) at various times, t. These photographs reveal the auto-evacuation of
the fugitive and catalytic inks, which leaves behind open channels that serve as mesofluidic features.
Extended Data Figure 6 | Infilling the soft controller from the fuel inlets. Water (with red or blue dye) is introduced into the fuel reservoir via the fuel
inlets. Continuity between the fuel reservoirs, soft controller and downstream EMB3D-printed components is possible because of the fugitive plugs,
which auto-evacuate along with the EMB3D-printed inks. Scale bar, 5 mm.
Extended Data Figure 7 | Characterization of EMB3D-printed
actuators. a, CAD model of a four-bladder actuator design. Other than
bladder number, the design is similar to the actuators illustrated in
Fig. 4a. b, EMB3D-printed actuator is shown before inflation. c, Actuator
inflated to working pressure. Scale bars in b and c, 5 mm. d, Pressure
versus displacement curves for four-bladder actuators with varying
hyperelastic-layer thickness, h (see legend; values in micrometres). The
data points reported are the mean inflation displacement angles for two
representative actuators over five inflation cycles and the shading indicates
95% confidence intervals. e, f, Blocked force versus pressure curves for
two-bladder (e) and four-bladder (f) actuators of varying hyperelastic-
layer thickness, h (see legends; values in micrometres). The lines shown
are third-degree polynomial fits of data collected from five inflation-
deflation cycles of representative actuators. The shaded regions indicate
95% confidence intervals. Detailed procedures for characterizing actuator
performance are provided in Methods.
Extended Data Figure 8 | Autonomous switching between actuating
states during octobot operation. a, Switching in the soft controller
was tracked with time according to the octobot operation recorded in
Supplementary Video 5. b, The corresponding inflation of actuators
associated with the blue and red actuation states is also reported
with time.
Extended Data Table 1 | Break-in and working gauge pressures for printed actuators

Hyperelastic layer    Two-bladder actuators      Four-bladder actuators
thickness (µm)        P0 (bar)     P1 (bar)      P0 (bar)     P1 (bar)

500                   0.35         0.3           0.6          0.55
750                   0.4          0.35          0.65         0.6
1000                  0.45         0.4           0.7          0.65
1250                  0.5          0.45          0.75         0.7
1500                  0.55         0.5           0.8          0.75

Before characterizing an actuator, break-in testing consists of five cycles, in which actuator air pressure is slowly (over about 30 s) ramped up to the pressure set
point, then slowly (over about 30 s) ramped down to ambient pressure. The pressure set point for the first cycle is P0 and the set point for all subsequent cycles is P1.
Data acquisition consists of five additional cycles for each actuator, in which air pressure is cycled as above to pressure set point P1.
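
The break-in and data-acquisition protocol in the table note can be written out as an explicit cycle schedule. The sketch below is a minimal illustration, assuming nominal 30-s ramps in place of the "about 30 s" stated above; the function name and the dictionary keys are hypothetical.

```python
def pressure_schedule(p0_bar, p1_bar, break_in_cycles=5, data_cycles=5, ramp_s=30.0):
    """List the pressure cycles for one actuator.

    Break-in: the first cycle ramps to P0, the remaining ones to P1.
    Data acquisition: further cycles, all ramped to P1.
    Each cycle ramps up to the set point and back down to ambient.
    """
    cycles = []
    for i in range(break_in_cycles):
        set_point = p0_bar if i == 0 else p1_bar
        cycles.append({"phase": "break-in", "set_point_bar": set_point,
                       "ramp_up_s": ramp_s, "ramp_down_s": ramp_s})
    for _ in range(data_cycles):
        cycles.append({"phase": "data", "set_point_bar": p1_bar,
                       "ramp_up_s": ramp_s, "ramp_down_s": ramp_s})
    return cycles


# Two-bladder actuator with a 500-µm hyperelastic layer (values from the table).
schedule = pressure_schedule(0.35, 0.3)
total_s = sum(c["ramp_up_s"] + c["ramp_down_s"] for c in schedule)
print(len(schedule), total_s)
```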
Biodiversity at multiple trophic levels is needed
for ecosystem multifunctionality 
Many experiments have shown that loss of biodiversity reduces the
capacity of ecosystems to provide the multiple services on which 
humans depend!”. However, experiments necessarily simplify 
the complexity of natural ecosystems and will normally control 
for other important drivers of ecosystem functioning, such as the 
environment or land use. In addition, existing studies typically focus 
on the diversity of single trophic groups, neglecting the fact that 
biodiversity loss occurs across many taxa** and that the functional 
effects of any trophic group may depend on the abundance and 
diversity of others®®. Here we report analysis of the relationships 
between the species richness and abundance of nine trophic groups, 
including 4,600 above- and below-ground taxa, and 14 ecosystem 
services and functions and with their simultaneous provision 
(or multifunctionality) in 150 grasslands. We show that high species 
richness in multiple trophic groups (multitrophic richness) had 
stronger positive effects on ecosystem services than richness in 
any individual trophic group; this includes plant species richness, 
the most widely used measure of biodiversity. On average, three 
trophic groups influenced each ecosystem service, with each 
trophic group influencing at least one service. Multitrophic 
richness was particularly beneficial for ‘regulating’ and ‘cultural’ 
services, and for multifunctionality, whereas a change in the 
total abundance of species or biomass in multiple trophic groups 
(the multitrophic abundance) positively affected supporting 
services. Multitrophic richness and abundance drove ecosystem
functioning as strongly as abiotic conditions and land-use intensity,
extending previous experimental results”* to real-world ecosystems. 
Primary producers, herbivorous insects and microbial decomposers 
seem to be particularly important drivers of ecosystem functioning, 
as shown by the strong and frequent positive associations of their 
richness or abundance with multiple ecosystem services. Our results 
show that multitrophic richness and abundance support ecosystem 
functioning, and demonstrate that a focus on single groups has led
researchers to greatly underestimate the functional importance
of biodiversity. 

Global change is causing species loss across many trophic groups*", 
with potential effects on the services that ecosystems provide to 
humans!”. The functional consequences of a decline in biodiversity 
across multiple trophic groups are hard to predict from studies focus- 
ing on single taxa, as the functional effects of different groups may 
complement or oppose each other**!”. The effects of the diversity 
of plants and microbes are complementary, maximizing rates of 
nutrient cycling’; plant and herbivore diversity, on the other hand, 
have opposing effects on biomass stocks!°!3, Consequently, we know 
very little about the relative effect of changes in the diversity of different 
trophic groups on the provision of individual*>**'*'4 or multiple 
(multifunctionality)!!’° ecosystem services. 

In addition to decreasing species richness, global change is altering 
the total abundance (total number of individuals or amount of 
biomass within communities) of multiple trophic groups*. Changes in 
abundance could mitigate or exacerbate the functional consequences
of species loss!*!” by influencing the ability of each trophic group to 
capture resources. However, studies normally focus on the effects 
of community evenness or of dominant species!®-0, whereas the 
simultaneous effects of changes in richness and total abundance on 
the functioning of ecosystems have been largely unexplored®'®. The 
relative importance of richness and abundance may depend on the 
function or service of interest. Total abundance could be a main driver 
of biogeochemical process rates (for example, biomass production’’, 
nutrient capture and cycling). By contrast, ecosystem services related 
to biotic interactions, such as pollination or pest control, could be 
predominantly driven by species richness'*®. Ecosystem services also 
depend on abiotic factors and, although experiments show that the 
effects of biodiversity loss on ecosystem functioning are as large as 
those of abiotic drivers”®, it is unclear whether species richness and 
abundance are similarly important in real-world ecosystems®!*?!?, 

We adopted a multitrophic approach to evaluate relationships 
between biodiversity and multifunctionality in 150 real-world 
grasslands. We measured the richness and abundance of species in 
nine trophic groups: primary producers, above- and below-ground 
herbivores and predators, detritivores, soil microbial decomposers, 
plant symbionts, and bacterivores. These trophic groups comprised 
4,600 plant, animal and microbial taxa, and were measured along- 
side 14 ecosystem variables (proxies for both functions and services, 
hereafter referred to as services). These are related to the four main 
types of ecosystem services”’: provisioning (fodder production and 
quality), supporting (potential nitrification, phosphorus retention, root 
biomass and decomposition rate, mycorrhizal colonization and soil 
aggregate stability), regulating (soil carbon levels, pollinator abundance, 
pest control, resistance to pathogens) and cultural services (recreation 
benefits of flower cover and bird diversity). We fitted linear models to 
our data to test for both positive and negative relationships between 
the richness and abundance of species within the nine trophic groups 
and each ecosystem service, the four types of services (provisioning, 
supporting, regulating and cultural), and ecosystem multifunctionality”” 
(see Methods). We accounted for potential confounding factors by 
performing our analyses on residuals, after controlling for variability 
in land-use intensity, soils and climate. We compared our results with 
models that included only plant-species richness, the most commonly 
used measure of biodiversity?!?4>, and with models that included the 
richness and abundance of each individual trophic group. Additional 
analyses compared the amount of variance explained by, and the effect 
size (standardized slope) of, multitrophic richness and abundance with 
those of land-use intensity and environmental variables. 
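
The residual-based analysis described above can be illustrated with a small example. The sketch below is a minimal illustration that assumes a single environmental covariate and ordinary least squares, and it residualizes only the response; the models used in the study include several covariates and nine trophic groups, and the exact residualization scheme is given in Methods. All names and data values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(x, y):
    """Part of y that is not explained by a linear effect of x."""
    slope, intercept = fit_line(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical plot-level data, for illustration only.
land_use = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]      # land-use intensity index
richness = [12.0, 15.0, 11.0, 9.0, 10.0, 7.0]  # richness of one trophic group
service = [3.1, 4.0, 2.9, 2.2, 2.6, 1.5]       # one ecosystem service

# Step 1: remove the land-use signal from the service.
service_resid = residuals(land_use, service)
# Step 2: relate the environment-corrected service to richness.
slope, intercept = fit_line(richness, service_resid)
print(slope, intercept)
```

By construction the residuals sum to zero and are uncorrelated with the covariate, so any remaining association with richness cannot be attributed to a linear effect of land use.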

Effects on individual ecosystem services, service types, and multi- 
functionality were better predicted by changes in multitrophic rich- 
ness and abundance than by those in the richness or abundance of 
any individual trophic group (Fig. 1 and Extended Data Fig. 1). The 
most parsimonious models included the richness and/or abundance 
of 3.14 ± 0.36 trophic groups (average ± s.e.m. across all 14 services)
to predict the variation in each ecosystem service. These results 
remained when using raw data instead of environment-corrected 
residuals (Extended Data Fig. 2), different combinations of ecosys- 
tem services (Extended Data Fig. 3), and even when we accounted for 
well-established links between a predictor and service in our models 
(for example, plant cover versus biomass). Multitrophic richness had 
stronger and more positive relationships with the provisioning, regulat- 
ing and cultural services than plant richness alone (Fig. 1 and Extended 
Data Fig. 1). For example, both plant and predator richness were related 
to high levels of pest control, suggesting that combined top-down and 
bottom-up effects of diversity*® maximize the provision of this regulat- 
ing service. Multitrophic richness also had a more positive effect than 
even the strongest positive-richness effect found across all individual 
trophic groups on the regulating and cultural services. The findings of 
our observational study were supported by a quantitative review of the 
few studies that manipulated the richness of more than one group. Our 
Figure 1 | Effects of multitrophic richness and abundance on grassland
functioning. a–d, Variance explained after accounting for the influence of
‘site’ as a random factor (marginal R², the equivalent of R² for mixed models)
and standardized effects for each ecosystem service when models included
abundance and richness of multiple or individual trophic groups.
e–h, Standardized effects (mean ± s.e.m.) of richness and abundance
(full and hatched bars, respectively) of individual trophic groups on
each ecosystem service type. Ecosystem service types are plant biomass
and forage quality (provisioning); potential nitrification, phosphorus
retention, mycorrhizal colonization, soil aggregate stability, root biomass
and decomposition (supporting); soil carbon, pollinator abundance, pest
control and resistance to pathogens (regulating); flower cover and bird
diversity (cultural).


review showed that including the richness of a second trophic group
increased the explained variance in ecosystem functioning by 14-96% for litter
decomposition", biomass production”, or the number of carbon 
sources used” (Extended Data Table 1). Collectively, our results show 
that high species richness in multiple trophic groups is necessary to 
maintain high levels of ecosystem functioning, particularly for regu- 
lating and cultural services. 

Alongside multitrophic richness, the combined effect of a high 
multitrophic abundance strongly affected ecosystem functioning 
(according to the amount of variance explained and its effect size). 
Multitrophic abundance had positive effects on the provisioning and 
supporting services, but these were generally weaker than those found 
for the individual trophic group that had the strongest positive effect. 
This suggests that an abundance of some trophic groups can dampen 
the effect on ecosystem functioning induced by others. Figure 1, 
for instance, shows that a higher abundance of predators partially 
counteracted the positive effects of abundant herbivores on supporting 
services. Conversely, a high level of richness in a given trophic 
Figure 2 | Functional importance of multiple trophic groups.
a, Proportion of the multifunctionality metrics (calculated using every
possible combination of 1-9 services; N = 501; see Methods) in which
the biotic attributes (richness and/or abundance) of each trophic group
were included in the most parsimonious model. b, Functional effects
(standardized slopes (mean ± s.e.m.) in the model fitted to all 14 services)
of the richness (open bars) and abundance (hatched bars) of each group.
Bars are shown only for the predictors included in the most parsimonious
models. Green and brown cartoons indicate above- and below-ground
trophic groups, respectively.


group generally complements the positive effects of other trophic 
groups on ecosystem services (see a comparison of multitrophic 
and unitrophic richness in Fig. 1). These contrasting effects caused 
multitrophic abundance to increase ecosystem multifunctionality 
only at low-to-moderate levels (Extended Data Fig. 1). Overall, our 
results underline the important role of species richness in driving 
the functioning of ecosystems!!*"!7*45, while also highlighting the 
often-overlooked effect of total biomass abundance on the supporting 
and provisioning services. 

To test how generally applicable the trends in relationships between 
multitrophic richness and abundance were, we calculated multifunc- 
tionality metrics using all possible combinations of services. High 
multitrophic richness or abundance had increasingly positive effects 
as more services were considered and this effect was consistent across 
a wide range of levels of multifunctionality (Extended Data Fig. 3). To 
further explore this result, we calculated the similarities in the identities 
of the trophic groups driving a given pair of ecosystem services (the 
functional overlap, 6 (ref. 25)). On average, we found functional overlaps
lower than 30% (6 = 0.27 ± 0.03, mean ± s.e.m.), similar to results
found for plant species in grassland experiments (6 =0.19-0.49)”°. This 
demonstrates low multitrophic redundancy and means that different 
services are supported by different trophic groups (Fig. 1 and Extended 
Data Fig. 1). We also found that different groups positively affected 
multifunctionality when it was calculated according to scenarios 
representing different land-use objectives (Extended Data Fig. 4). 
Finally, five of the nine trophic groups had the strongest net-positive 
effects on at least one ecosystem service (for example, primary 
producers on pest control, soil microbial decomposers on aggregate 
stability; Extended Data Fig. 1), with each group affecting at least one 
service. Collectively, these results show the low functional redundancy 
found between the multiple trophic groups studied, explaining why 
high multitrophic richness is needed to support high levels of ecological 
multifunctionality or to promote a larger number of ecosystem services. 
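
The idea of functional overlap between pairs of services can be illustrated with a set-similarity calculation. The sketch below is a minimal illustration that assumes the Sørensen similarity index as the overlap measure; the metric actually used follows ref. 25 and is not spelled out in this section, so this choice, the function names and the example driver sets are all hypothetical.

```python
from itertools import combinations


def sorensen_overlap(groups_a, groups_b):
    """Sørensen similarity between two sets of driving trophic groups (0 to 1)."""
    a, b = set(groups_a), set(groups_b)
    if not a and not b:
        return 0.0
    return 2.0 * len(a & b) / (len(a) + len(b))


def mean_pairwise_overlap(drivers):
    """Average overlap across all pairs of services.

    `drivers` maps each service name to the set of trophic groups
    retained as predictors of that service.
    """
    pairs = list(combinations(sorted(drivers), 2))
    total = sum(sorensen_overlap(drivers[s], drivers[t]) for s, t in pairs)
    return total / len(pairs)


# Hypothetical driver sets, for illustration only.
drivers = {
    "pest control": {"primary producers", "above-ground predators"},
    "aggregate stability": {"soil microbial decomposers", "plant symbionts"},
    "fodder production": {"primary producers", "above-ground herbivores"},
}
mean_overlap = mean_pairwise_overlap(drivers)
print(round(mean_overlap, 3))
```

A low mean overlap in such a calculation corresponds to the low redundancy described above: different services are driven by largely different sets of trophic groups.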

The relationships between multitrophic richness, multitrophic abun- 
dance and ecosystem services were not always positive (Figs 1, 2 and 
Extended Data Fig. 1), consistent with previous studies”””*. Negative 
relationships might be explained by interference between species 
Figure 3 | Biotic versus abiotic drivers of ecosystem functioning.
Variation partitioning for three predictor categories in our statistical
models: environment, species richness and total abundance (details
in Methods). Diagrams show the average across services within each
ecosystem service type (detailed results are in Extended Data Figs 2 and
5). Shown are the unique variance explained by each predictor category,
the shared variance between these categories (intersections of circles), and
the variance not explained by the models (the residual). Biota refers to the
total variance explained by species abundance and richness combined.
Standardized effect sizes are shown as bar plots.


within a given trophic group’” or by compositional shifts leading to 
declines in ecosystem functioning!””°”’. Despite these negative associ- 
ations or harm to services, our results suggest that the most important 
trophic groups for maintenance of the services considered are above- 
ground herbivorous insects, primary producers and soil microbial 
decomposers. The richness or abundance of these trophic groups
was most often correlated with ecosystem multifunctionality (43-72%
of the 501 possible combinations between the services we measured), 
and had net-positive effects across all services (Fig. 2). These three 
groups also showed strong and frequent positive associations with the 
four main ecosystem service types (Fig. 1). These results agree with 
other studies that have identified plants and soil microorganisms as 
key drivers of ecosystem functioning'»'*'>”’, extending these findings 
to the richness and abundance of different trophic groups, including 
primary producers and consumers both above and below ground. The 
species richness of some of these functionally important trophic groups 
relates to whole-ecosystem diversity*”” and, thus, management strategies
focused on them may foster synergies between biodiversity conserva- 
tion and high multifunctionality levels. 

The relative importance of both multitrophic richness and 
abundance compared to the environmental drivers of ecosystem 
functioning has been rarely studied outside of experiments or indi- 
vidual functions”*"!5124, We therefore calculated the proportion of 
variance in ecosystem functioning that was explained by multitrophic 
richness, abundance and environmental (soil, topography and land- 
use) factors. Our models accounted for a large proportion (54-64%) 
of the variance in the provisioning, supporting, regulating and cultural 
ecosystem service types (Fig. 3 and Extended Data Fig. 5). Multitrophic 
richness and abundance explained at least as much of the variance in 
ecosystem functioning as abiotic conditions or land-use intensity did, 
and generally had stronger effects (Fig. 3; Extended Data Fig. 5). These 
results provide evidence that biodiversity is of comparable importance
to environmental factors in driving ecosystem functioning. This is true 
not only for individual functions in small-scale experiments”, but also
for multiple ecosystem services in realistic landscapes (see refs 15, 24). 

Our study shows that the functional importance of biodiversity in 
real-world ecosystems has been greatly underestimated, as a result 
of focussing on individual trophic groups. We demonstrate here that 
the functional effects of multitrophic richness and abundance are as 
strong as, or even stronger than, those of the environment or land-use 
intensity. We identified primary producers, above-ground herbivores 
and soil decomposers as particularly important trophic groups for 
maintaining a functioning ecosystem. Our results suggest that it is 
important to preserve high levels of species richness, abundance or 
both within a wide range of taxa. This must include taxa often ignored 
by conservation efforts such as soil microbial decomposers’, or those 
considered pests in agricultural systems such as herbivorous insects, if 
we are to promote high levels of the multiple ecosystem services upon 
which human well-being depends. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS


No statistical methods were used to predetermine sample size. The experiments 
were not randomized. The investigators were not blinded to allocation during 
experiments and outcome assessment. 
Study sites. We selected a total of 150 grassland sites (50m x 50m) in three regions 
of Germany (50 sites per region) to cover a gradient of land-use intensities, charac- 
terized by contrasting grazing, fertilization and mowing levels (www.biodiversity- 
exploratories.de, ref. 31). The regions in the south-west (Schwäbische Alb) and the
north-east (Schorfheide-Chorin) are UNESCO Biosphere Reserves, whereas the 
central region is in and around the Hainich National Park. The three regions differ 
substantially in geology, climate and topography*’, covering a range of ~3°C in 
mean annual temperature and 500 mm in annual precipitation. Plots in each region 
cover the range of land-use intensities typical for Central European grasslands. 
We obtained information on land use via questionnaires sent to land owners, 
asking about the number and type of livestock (converted to livestock units) and 
the duration of grazing in each plot, the fertilization (from which we calculated 
the amount of nitrogen added), and the mowing (number of cuts per year*)*?), 
We used this information to calculate three standardized indices summarizing 
grazing, fertilization and mowing intensity (see ref. 32 for full methodological 
details). 
Diversity measures. At each site, we measured the species richness and abun- 
dance of nine functional groups using standard methodologies (Extended Data 
Table 2). In total we observed about 4,600 taxa on the 150 grasslands studied. The 
nine trophic groups were: primary producers (vascular plants and bryophytes), 
below-ground herbivores (herbivorous insect larvae sampled in the soil), below- 
ground predators (carnivorous insect larvae sampled in the soil), detritivores 
(insects and Diplopoda feeding on leaf litter and other detritus), soil microbial 
decomposers (soil bacteria), above-ground herbivores (insects feeding solely 
on above-ground plant material), above-ground predators (carnivorous insects, 
spiders and Chilopoda), plant symbionts (arbuscular mycorrhizal fungi), and 
bacteria-feeding protists (heterotrophic flagellates and ciliates). Lichens and 
omnivores were not considered in our analyses as they were too rare. We directly 
measured species richness for most groups, but richness was quantified as family 
richness for below-ground insects and soil bacteria and as the number of opera- 
tional taxonomic units (OTUs) for the mycorrhizae and protists. The abundance 
of each trophic group was also measured using different methods: number of 
individuals for arthropods, amount of cover for vascular plants and bryophytes, 
and relative proportion of sequence reads assigned to each family or OTU for 
protists, soil bacteria and mycorrhiza. To avoid multicollinearity, we did not 
include the abundances of protists or detritivores as they were highly correlated 
(Spearman's ρ > 0.6) with richness (for more details see Extended Data Table 3). 
We also measured the abundance and richness of foliar fungal pathogens, polli- 
nators and birds; however, to include a broader range of ecosystem services in our 
analyses, we treated these groups as proxies of ecosystem services. Total pollinator 
abundance and the inverse of pathogen abundance were treated as proxies of 
regulating services (pollination and disease regulation), and we used bird-species 
richness as a measure of a cultural ecosystem service. Lepidoptera behave as 
herbivores during juvenile stages and as pollinators when adults. To avoid 
accounting for them twice, we assigned them to only one trophic group 
(pollinators), as the data were counts of the adult butterflies, not the caterpillars. 
Ecosystem functioning measures. At each site, we measured 14 different 
ecosystem variables (both functions and service proxies; Extended Data Table 2) 
and classified them into four types of services following the Millennium Ecosystem 
Assessment. These 14 ecosystem services were: i) supporting services related to 
nutrient capture and cycling (root biomass, root decomposition rates, potential 
nitrification (based on urease activity in soil samples), phosphorus retention 
(calculated as a ratio between shoot and microbial phosphorus stock and soil 
extractable phosphorus), arbuscular mycorrhizal fungal root colonization 
(measured as hyphal length), soil aggregate stability (proportion of water-stable 
soil aggregates)); ii) provisioning services related to agricultural value (forage 
production (above-ground plant biomass) and forage quality (based on crude 
protein and relative forage value)); iii) regulating services for neighbouring crop 
production or climate regulation (resistance to plant 
pathogens, pest control, pollinator abundance and soil organic carbon); and iv) 
cultural services linked to recreation (bird diversity and flower cover). Because the 
values for trophic groups and ecosystem functions varied widely, we standardized all 
variables to a common scale ranging from 0 to 1 according to the following formula: 
$\mathrm{STD} = (X - X_{\min})/(X_{\max} - X_{\min})$, where STD is the standardized variable and $X$, $X_{\min}$ 
and $X_{\max}$ are the target variable and its minimum and maximum values across all 
sites, respectively. This made slope estimates for different predictors comparable. 
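The standardization above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `min_max_standardize` and the example values are hypothetical.

```python
def min_max_standardize(values):
    """Rescale site-level values to the 0-1 range: (X - Xmin) / (Xmax - Xmin)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant variable has no range, so the formula is undefined.
        raise ValueError("cannot standardize a constant variable")
    return [(x - lo) / (hi - lo) for x in values]


# Hypothetical richness values for four sites (not data from the study).
richness = [12.0, 30.0, 21.0, 48.0]
print(min_max_standardize(richness))  # [0.0, 0.5, 0.25, 1.0]
```

The minimum maps to 0 and the maximum to 1, which is what puts slopes for different predictors on the same scale.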
We calculated ecosystem multifunctionality metrics from the 14 services as 
the percentage of measured services that exceeded a given threshold of their 
maximum observed level across all study sites (we used measured services only, 
to correct for the fact that some services had not been measured in all sites). 
To reduce the influence of outliers we calculated the maximum observed level as 
the average of the top five sites. Given that any threshold is likely to be 
arbitrary, the use of multiple thresholds is recommended to better understand 
the role that biodiversity plays in affecting ecosystem multifunctionality and 
to understand trade-offs between functions of interest. Therefore, we used four 
different thresholds (25%, 50%, 75% and 90%) to represent a wide spectrum in the 
analyses performed (Extended Data Figs 1-3). As an alternative approach we also 
calculated multifunctionality scenarios, weighting the services differently 
according to the different potential views of stakeholders (that is, stakeholders 
willing only to promote provisioning services versus those trying to maximize 
cultural and recreation services or the sustainability of soils and crops; 
Extended Data Fig. 4). 
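The threshold-based metric described above can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the function names, the use of `None` for a service not measured at a site, the strict `>` comparison and the example data are all hypothetical choices.

```python
def top_mean(values, n_top=5):
    """Reference maximum: mean of the n_top highest observed values."""
    ranked = sorted(values, reverse=True)[:n_top]
    return sum(ranked) / len(ranked)


def multifunctionality(services, site, threshold, n_top=5):
    """Percentage of the services measured at `site` that exceed
    `threshold` times the reference maximum of that service.

    services: dict mapping service name -> list of per-site values,
              with None where the service was not measured.
    """
    measured = 0
    above = 0
    for values in services.values():
        if values[site] is None:
            continue  # only services measured at this site are counted
        observed = [v for v in values if v is not None]
        measured += 1
        if values[site] > threshold * top_mean(observed, n_top):
            above += 1
    return None if measured == 0 else 100.0 * above / measured


# Hypothetical standardized values for six sites and three services.
services = {
    "forage": [1.0, 0.8, 0.6, 0.4, 0.2, 0.9],
    "soil_c": [0.5, 0.5, 0.5, 0.5, 0.5, 0.1],
    "birds": [None, 0.3, 0.9, 0.6, 0.6, 0.6],
}
for t in (0.25, 0.50, 0.75, 0.90):
    print(t, multifunctionality(services, site=5, threshold=t))
```

Dividing by the number of services measured at the site, rather than by the total number of services, is what keeps sites with missing measurements comparable.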

Effects of multitrophic richness and abundance on grassland ecosystem services 
and multifunctionality. We used linear models to evaluate the relationships 
between species richness and abundance in the nine trophic groups and each of 
the 14 individual ecosystem services, along with the different multifunctionality 
metrics (four thresholds, plus the metrics obtained by weighting each ecosystem 
service according to different potential stakeholders’ needs; for example, 
only provision, sustainable soils and crops, or cultural scenarios, see ref. 34). In all 
cases, we used a Gaussian error distribution as the errors of our response variables 
were normally distributed. We report the effects of the different trophic groups 
on the different functions as slopes from the multiple regression model; these 
are corrected for the effects of all other variables in the model. Since our main 
focus was on calculating the independent effects of the richness and abundance of 
the different trophic groups, we corrected them for co-varying factors. Thus, we 
calculated residuals for all our variables (both biotic predictors and functioning 
measures) from linear models including region, land-use intensity (standardized 
measures of mowing, grazing and fertilization intensity) and other important envi- 
ronmental factors (soil type and depth, pH, a topographic wetness index based 
on position within the slope and orientation, and elevation). As an alternative 
to using residuals, we also fitted models with all the environmental and land-use 
predictors (standardized to give comparable coefficients) alongside the diversity 
and abundance measures. These approaches gave very similar results (Extended 
Data Fig. 2). Standardized coefficients of the functional effects of richness were 
very similar, whether or not abundance was included (ρ = 0.80, P < 0.0001, 
N = 162; data not shown). We also fitted models with the abundance and richness 
of only one individual trophic group to compare the results of the best individual 
trophic group with the multitrophic analyses (Extended Data Fig. 1). Finally, 
we fitted models with only richness of vascular plant species as a predictor. The 
latter is the most common measure of biodiversity and we used it to 
compare our results with those found in previous studies on biodiversity- 
ecosystem functioning relationships. 
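The residual-based correction for co-varying factors described above can be illustrated with a deliberately reduced sketch. It assumes a single continuous covariate (land-use intensity) instead of the full set of region, land-use and environmental predictors, and all function names and values are hypothetical.

```python
def fit_line(y, x):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residuals(y, x):
    """Part of y that is not linearly explained by x."""
    intercept, slope = fit_line(y, x)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Hypothetical standardized values for five sites.
land_use = [0.0, 0.2, 0.5, 0.7, 1.0]
richness = [0.9, 0.8, 0.4, 0.5, 0.1]
forage = [0.2, 0.3, 0.6, 0.5, 0.9]

# Remove the covariate from both predictor and response, then relate the residuals.
richness_res = residuals(richness, land_use)
forage_res = residuals(forage, land_use)
_, corrected_slope = fit_line(forage_res, richness_res)
print(corrected_slope)
```

The slope between the two sets of residuals equals the richness coefficient from a model that includes the covariate directly, which is consistent with the two approaches giving very similar results.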

We performed model simplification using the stepAIC function in R, and 
further simplified the minimal models produced using AIC by removing all terms 
that were not significant according to F-ratio tests (Extended Data Table 4). Results 
using alternative approaches for model selection are presented in Extended Data 
Fig. 6. We did not fit interactions between the richness and abundance of different 
trophic groups, or between those and environmental factors, as this would require 
a large number of coefficients, would be difficult to interpret and would require 
an even larger data set than ours (see ref. 20 for a study evaluating the interaction 
between land-use and diversity). We did not find evidence of nonlinear relation- 
ships between our predictors and the ecosystem services measured when checking 
all bivariate relationships; thus we did not include nonlinear terms in the models 
to keep them simple. 

Not all trophic groups or ecosystem services were measured on all sites; thus 
different services were analysed using different sized data sets (N ranged between 
54 and 111, depending on the service). The different sample sizes used were 
not related to the number of trophic groups included in the most parsimonious 
model (Spearman’s rank correlation coefficient ρ = 0.32), the increase in variance 
explained by vascular plant species richness (ρ = −0.21) or the net effect of richness 
or abundance (ρ = −0.01 or 0.05, respectively; N = 14 and P > 0.25 in all cases). 
Thus, fitting models differing in sample size for different services did not affect 
our results. 

The inclusion of many predictors in statistical models increases the chance of 
type I error (false positives). To account for this we used a Bernoulli process to 
detect false discovery rates, where the probability (P) of finding a given number 
of significant predictors (K) just by chance is a proportion of the total number 
of predictors tested (N = 16 in our case: the abundance and richness of 7 and 9 
trophic groups, respectively) and the P value considered significant (α = 0.05 in our 
case). The probability of finding three significant predictors on average, as we 
did, is therefore $P = \frac{16!}{(16-3)!\,3!} \times 0.05^{3} \times (1 - 0.05)^{16-3} = 0.0359$, indicating 
that the effects we found are very unlikely to be spurious. The probability of false 
discovery rates when considering all models and predictors fit (14 ecosystem 
services x 16 richness and abundance metrics) and the ones that were significant 
amongst them (52: 25 significant abundance predictors and 27 significant richness 
predictors) was even lower (P < 0.0001). All analyses were performed using 
R version 3.0.2 (ref. 41). 
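The single-model probability quoted above is a binomial term and can be checked directly. This is a minimal sketch: the function name is hypothetical, and the inputs (16 predictors, 3 significant, α = 0.05) are the values given in the text.

```python
from math import comb


def prob_k_significant(n_predictors, k_significant, alpha=0.05):
    """Binomial probability of exactly k of n predictors being
    significant by chance when each has false-positive rate alpha."""
    return (
        comb(n_predictors, k_significant)
        * alpha ** k_significant
        * (1 - alpha) ** (n_predictors - k_significant)
    )


p = prob_k_significant(16, 3)
print(round(p, 4))  # 0.0359
```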

Net functional effects of the different trophic groups across ecosystem service 
types. We calculated the net effect of each trophic group on each ecosystem service 
type (provisioning, supporting, regulating and cultural) by fitting all services 
belonging to these types into a single model. To do so, we added two extra pre- 
dictors to our models: ‘service identity’ as a fixed factor, to account for differences 
between individual services, and ‘site’ as a random factor, to account for 
correlations between services, abundance and richness values measured on the same site. 
Since we were interested in the net effects of each group across all services, we did 
not fit interactions between our multitrophic predictors and service identity. The 
net effect across all services was analysed using the same approach, while fitting a 
single model for the 14 ecosystem services at the same time. This approach corrects 
for the fact that the individual service models vary in their explanatory power and 
in the predictor variables included. Fitting all services into a single model allows us 
to obtain a robust estimate of the net functional effect (the standardized coefficient 
from the model) of the abundance and richness of each trophic group on each 
ecosystem service type and on ecosystem multifunctionality, together with 
an estimate of its error. If the standardized coefficient was positive, we interpreted it 
as a net overall positive effect of either richness or abundance across all services, or 
on a given service type (Figs 1 and 2). In all cases, we used standardized coefficients 
of the most parsimonious models after model reduction. However, our results 
remained when using other approaches that account for differences in model fit, 
such as multi-model averaging coefficients (coefficients were weighted according 
to the AIC weight of the models in which each predictor is included) or when 
weighting the standardized coefficient for each ecosystem service by the adjusted 
R² of each model (which should also be comparable across models with different 
response variables; Extended Data Fig. 6). 

Variance partitioning analyses. Variance partitioning analyses (also known as 
commonality analyses) were performed with standard techniques based on the 
comparison of variance explained by models including every possible combination 
of variables. Variables were organized by environment (study region, soil type, 
pH, topographic wetness index, grazing and fertilization, with the remaining envi- 
ronmental predictors removed to prevent multicollinearity; shown in Extended 
Data Table 3), species richness (standardized species richness of the nine trophic 
groups) and abundance (standardized abundance of those trophic groups in 
which abundance and richness were not strongly correlated (ρ < 0.6; shown in 
Extended Data Table 3)). Thus, we fitted a series of seven models for each service 
and multifunctionality metric (at the 25%, 50%, 75% and 90% thresholds) to extract 
the unique and shared variance for each combination of variables (environment 
only, richness only, abundance only, environment + abundance, environment + 
richness, richness + abundance, and all predictors together). Variance-partitioning 
analyses were performed with the full models (without model simplification) to 
allow us to compare between different services. As a consequence, we used R² 
rather than the adjusted R² because, owing to the large number of predictors, some 
adjusted R² values were negative, complicating the extraction of unique variance 
explained by each predictor. Venn diagrams were drawn using Euler APE for 
Windows™. 
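Once the seven models have been fitted, the unique and shared fractions follow from simple arithmetic on their R² values. The sketch below shows that arithmetic only. The key names (`E` for environment, `R` for richness, `A` for abundance, and their concatenations for combined models) and the R² values are hypothetical, and no model fitting is performed.

```python
def partition_variance(r2):
    """Split explained variance into the seven Venn-diagram regions.

    r2: dict with the R^2 of the models fitted with E, R, A, ER, EA, RA
        and ERA (all three components together).
    """
    full = r2["ERA"]
    parts = {
        # Unique fractions: what is lost when one component is dropped.
        "E only": full - r2["RA"],
        "R only": full - r2["EA"],
        "A only": full - r2["ER"],
        # Pairwise overlaps that exclude the third component.
        "E and R": r2["EA"] + r2["RA"] - r2["A"] - full,
        "E and A": r2["ER"] + r2["RA"] - r2["R"] - full,
        "R and A": r2["ER"] + r2["EA"] - r2["E"] - full,
        # Variance shared by all three components.
        "E, R and A": r2["E"] + r2["R"] + r2["A"]
        - r2["ER"] - r2["EA"] - r2["RA"] + full,
    }
    parts["residual"] = 1.0 - full
    return parts


# Hypothetical R^2 values for the seven models of one service.
r2 = {"E": 0.22, "R": 0.16, "A": 0.12,
      "ER": 0.28, "EA": 0.26, "RA": 0.21, "ERA": 0.31}
for region, value in partition_variance(r2).items():
    print(f"{region}: {value:.2f}")
```

The seven explained regions sum to the R² of the full model. Shared fractions can come out negative with real data, which is one reason such analyses are usually read alongside the raw R² values.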

To compare the effect size among richness, abundance and environment on 
the different ecosystem services and multifunctionality metrics, we summed the 
standardized coefficients of all predictors from each component (the abundance 
of five trophic groups (abundance), the richness of nine trophic groups (richness), 
and pH, fertilization, grazing, and topographic wetness index (environment)). 
We excluded study region and soil type when summing effects, as these were 
categorical predictors and their coefficients were not straightforward to interpret. 
We performed these calculations for each of the 14 ecosystem services and 
4 multifunctionality metrics in isolation (Extended Data Fig. 4), and for each 
ecosystem service type (Fig. 2) by fitting all the ecosystem services belonging 
to each type in a single model (again, adding ‘service identity’ 
and ‘site’ as fixed and random predictors, respectively). 

Analysing every possible combination of ecosystem services. Studies on mul- 
tifunctionality are difficult to compare as they include different measures of 
ecosystem functioning. To allow us to generalize our results and to test whether 
multitrophic richness and abundance are more important in supporting higher 
numbers of services simultaneously, we also calculated multifunctionality 
indices using every possible combination of the services we measured. We did 
this after removing those services with more than 20 missing sites, leaving a 
total of 9 services (501 combinations) as response variables. We calculated 
multifunctionality at the 25%, 50%, 75% and 90% thresholds for all these com- 
binations (Extended Data Fig. 3). We also tested the sensitivity of our analyses 
to missing data by repeating our analyses for every possible combination of 1-13 
of the 14 measured services (16,368 combinations; results for multifunctionality 
calculated with all 14 ecosystem services are presented in Extended Data Figs 1, 2). 
To allow the comparison of models with different services, data gaps were filled 
with the average value of a given service in a given region, which is a conservative 
approach. In both cases (combinations of 1-13 or 1-9 functions), the most 
parsimonious models possible were selected on the basis of their AIC. This avoids 
inflated type I error, caused by fitting a large number of models, as model selection 
was not based on P values. Results using 9 or 14 functions were qualitatively the 
same and therefore only the former are shown here. 
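The number of service combinations can be checked with a short sketch. The function name and the placeholder service labels are hypothetical. One observation, which is an inference from the arithmetic and not stated in the Methods: the reported counts (501 and 16,368) are reproduced when combinations contain at least two services and fewer than all of them, whereas including single services would give 510 and 16,382.

```python
from itertools import combinations
from math import comb


def count_combinations(n_services, min_size, max_size):
    """Number of distinct service subsets with sizes in [min_size, max_size]."""
    return sum(comb(n_services, k) for k in range(min_size, max_size + 1))


print(count_combinations(9, 2, 8))    # 501
print(count_combinations(14, 2, 13))  # 16368

# Enumerating the subsets themselves, here for 9 placeholder service labels.
labels = [f"service_{i}" for i in range(1, 10)]
subsets = [c for k in range(2, 9) for c in combinations(labels, k)]
print(len(subsets))  # 501
```

A multifunctionality index would then be computed for each subset in `subsets` at each threshold.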

Review of multitrophic manipulative approaches. Manipulative experiments 
including as many groups and services as we considered in this study do not yet 
exist. However, we compared our correlational results with available evidence 
from experiments manipulating the diversity of more than one trophic group. 
To do this we performed a bibliographic search in the Web of Knowledge and 
in Google Scholar using all combinations of the terms ‘multitrophic’ or ‘trophic 
groups’ + ‘functioning’ or ‘multifunctionality’ or ‘biomass’ or ‘ecosystem services’ 
or ‘diversity’. We also screened references within available reviews on multitrophic 
diversity-ecosystem functioning relationships. Of the papers found, we 
selected those which fulfilled the following criteria: i) it was a manipulative study, 
ii) it included a range in species richness (not only presence or absence) of, at least, 
two different trophic groups and iii) it provided enough information to calculate 
the increase in variance explained by the addition of a second trophic group. Only 
four studies, including seven ecosystem functions, fulfilled these criteria (Extended 
Data Table 1). Some of these manipulative studies did not include plants, so we 
calculated the percentage increase in variance seen when comparing a model with 
the trophic group that had the strongest explanatory power in models containing 
two trophic groups. When the same function was measured across several studies 
(that is, biomass), we calculated the average increase in variance explained for 
this variable when another trophic level was added. These results were used to 
qualitatively compare the limited evidence from multitrophic manipulations with 
our results. 
Extended Data Figure 1 | Functional effects of multitrophic richness 
and abundance on 14 grassland ecosystem services. a, Standardized 
coefficients (mean ± s.e.m.) of the abundances (triangles) and richness 
(circles) of those trophic groups that significantly affect a given function 
are shown. b, The net effect (that is, the sum of significant standardized 
effects). c, Difference in adjusted R² between the final multitrophic 
models and those models using the abundance and richness of the best 
performing individual trophic group (unitrophic) or plant species richness 
(plant richness). Ecosystem services are organized by the main four types 
of services they associate with (provisioning, supporting, regulating 
and cultural). The number of trophic groups included in the most 
parsimonious model is given next to their adjusted R². Multifunctionality 
results at 25%, 50%, 75% and 90% thresholds are also shown 
(see Methods). 
functions. a, Standardized slope estimates (mean ± s.e.m.) for each 
significant predictor are shown, with the exception of study region and soil 
type, which were retained in all models. b, Net effect (sum of significant 
standardized effects) for multitrophic richness and abundance. c, The 
total amount of variance explained by either environmental + plant 
individual trophic predictor, or by environmental + multitrophic diversity 
and abundance are shown for each function (adjusted R², to control 
to the adjusted R² value. The increase in the adjusted R² values in models 
with plant-species-richness averaged 0.07 ± 0.12 (across functions) and 
0.06 ± 0.11 (across functions and multifunctionality indices). Ecosystem 
services are organized by the main four types of services they associate 
with (top-bottom: provisioning, supporting, regulating and cultural). 
the slope, and the inclination of the slope. Multicollinearity between the 
predictors introduced is unlikely (Extended Data Table 3). 
[Extended Data Figure 5 panels: Venn diagrams of Environment, Richness and 
Abundance, one per function or multifunctionality metric. Residual variance per 
panel: pollinator abundance, 0.61; pest control, 0.63; resistance to pathogens, 
0.48; soil organic C, 0.13; flower cover, 0.37; bird richness, 0.36; 
multifunctionality 25%, 0.65; multifunctionality 50%, 0.66; multifunctionality 
75%, 0.64; multifunctionality 90%, 0.73. Legible Biota values: soil organic C, 
0.06; multifunctionality 50%, 0.07.] 


Extended Data Figure 5 | Functional importance of species richness 
and abundance compared to environmental drivers. Venn diagrams 
showing the variance partition for the four components of our statistical 
models (environment: climate, soil and land-use intensity; species richness 
of the nine trophic groups, abundance of primary producers, above- and 
below-ground predators, below-ground herbivores and soil microbial 
decomposers). The variance not explained by the model (the residual) 
is also shown. The variance explained by richness, abundance and their 
overlap is summed up as Biota. Each panel represents an individual 
function or multifunctionality metric. 
Extended Data Figure 6 | Functional effect of the different trophic 
groups. Overall functional effects (mean ± s.e.m. of the standardized 
slopes obtained from the model; with the exception of a, where error could 
not be estimated) of the richness (open bars) and abundance (hatched 
bars) of each group. a, The values were calculated after weighting each 
standardized coefficient (those in Extended Data Fig. 1) by the adjusted 
R² of the model to account for differences in model performance. 
b, c, The values were calculated as the standardized coefficients in a 
general model fitted to all services at once, including ‘service identity’ 
as an extra predictor and ‘plot’ as random factor to control for pseudo- 
replication (reduced models (b); the ones presented in the main text), 
or full models (c) and, d, calculated as multi-model average parameters 
from a model fitted to all services at once. Correlations (Spearman's rank 
correlation coefficients) between the different approaches are given. 
Extended Data Table 1 | Re-analysis of manipulative multitrophic studies 

ID | Reference | System | Trophic groups manipulated | Approach | Response variable | Variance explained, best single group | Second group added (grey column) | Comments 
1 | Douglass et al. 2008 Ecol. Lett. | aquatic | 2 | mesocosm | grazer abundance | 55.0 | 61.0 | based on ω² 
1 | Douglass et al. 2008 Ecol. Lett. | aquatic | 2 | mesocosm | predator abundance | 5.0 | 7.0 | based on ω² 
2 | Bruno et al. 2008 Ecology | aquatic | 2 | mesocosm | autotroph biomass | 19.5 | 42.7 | based on F 
3 | Naeem et al. 2000 Nature | terrestrial | 2 | microcosm | autotroph biomass | 13.6 | 23.5 | based on F 
3 | Naeem et al. 2000 Nature | terrestrial | 2 | microcosm | detritivore biomass | 5.6 | 11.3 | based on F 
3 | Naeem et al. 2000 Nature | terrestrial | 2 | microcosm | # C sources used | 7.2 | 13.5 | based on F 
4 | Handa et al. 2014 Nature | both | 2 | field expt | litter C loss | 5.8 | 6.6 | based on %SS 

For each study, an ID number and full reference are given. The system in which each study was performed (aquatic or terrestrial), the number of trophic groups manipulated and the approach 
used (controlled mesocosms or field studies) are provided. The ecosystem functions (‘response variable’) measured within each study were grouped in biomass production (the first five rows), 
nutrient cycling (sixth row) and decomposition (seventh row). Variance explained (according to the statistic mentioned in comments; ω² = proportion of variance explained according to the authors; 
F = Fisher’s F, SS = sum of squares) for the single trophic group with the most explanatory power, and the difference between the variance explained by this group and the inclusion of a second group 
are given (grey column). 


© 2016 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 


LETTER 


Extended Data Table 2 | Details of the sampling procedure for each trophic group and function 

Trophic group | Subgroup | Sampling method | Author 
Primary producers | Plants, bryophytes | Measurement of % cover in a 4 × 4 m subplot, done in 2009 | Boch, Heinze, Hölzel, Klaus, Kleinebecker, Müller, Prati, Socher, Fischer 
Aboveground herbivores | Herbivorous insects | Sweep netting (Hemiptera: Heteroptera/Auchenorrhyncha, Hymenoptera, Neuroptera and Orthoptera). Transects of 150 m with 60 double sweeps, done twice per plot in 2008-2010 | Lange, Pašalić, Türke, Gossner, Weisser 
Aboveground predators | Carnivorous insects | Sweep netting (Hemiptera: Heteroptera/Auchenorrhyncha, Hymenoptera, Neuroptera and Orthoptera). Transects of 150 m with 60 double sweeps, done twice per plot in 2008-2010 | Lange, Pašalić, Türke, Gossner, Weisser 
Aboveground predators | Spiders | Sweep netting. Transects of 150 m with 60 double sweeps, done twice per plot in 2008-2010 | Lange, Pašalić, Türke, Gossner, Weisser 
Aboveground predators | Chilopoda | Kempson extraction from one soil core of 20 × 5 cm per plot, done in 2008 | Birkhofer, Diekötter, Wolters 
Detritivores | Annelids | Hand sorting from two soil cores of 20 × 10 cm per plot, done in 2008 | Birkhofer, Diekötter, Wolters 
Detritivores | Diplopoda | Kempson extraction from one soil core of 20 × 5 cm per plot, done in 2008 | Birkhofer, Diekötter, Wolters 
Detritivores | Detritivorous insects | Sweep netting (Hemiptera: Heteroptera/Auchenorrhyncha, Hymenoptera, Neuroptera and Orthoptera). Transects of 150 m with 60 double sweeps, done twice per plot in 2008-2010 | Lange, Pašalić, Türke, Gossner, Weisser 
Microbial decomposers | Soil bacteria | cDNA amplicon sequencing of partial (V3) 16S rRNA gene transcripts, done in 2011 | Baumgartner, Sikorski, Overmann 
Bacterivores | Bacterivorous protists | 18S rDNA gene PCR and amplicon sequencing (454) filtering for rhizarians, alveolates, stramenopiles and opisthokonts, done in 2011 | Venter, Arndt 
Symbionts | Arbuscular mycorrhizal fungi | Pyrotag sequencing of the NS31 - AM1 fragment of the 18S rDNA genes, done in 2011 | Klemmer, Wubet, Buscot 
Belowground herbivores | Insect larvae | Extracted from a heat/moisture gradient in one soil core of 20 × 5 cm per site, done in 2011 over a period of eight days | Sonnemann, Wurst 
Belowground predators | Insect larvae | Extracted from a heat/moisture gradient in one soil core of 20 × 5 cm per site, done in 2011 | Sonnemann, Wurst 

Function | Sampling method | Author 
Aboveground plant biomass | Harvested in four 0.5 m × 0.5 m quadrats per plot, done in May-June in 2008-2012 | Schmitt, Prati, Fischer, Klaus, Kleinebecker, Hölzel 
Belowground plant biomass | Measured in 14 soil cores (0-10 cm). Fine roots were sorted and weighed after drying in the oven, done in samples collected in May 2011 | Solly, Schöning, Schrumpf 
Root decomposition rate | Measured as the mass loss from root litter bags after 6 months, from October 2011 to April 2012 | Solly, Schöning, Schrumpf 
Potential nitrification | 10 mM ammonium sulphate solution was added as substrate to 2.5 g of soil composite samples (i.e. the same samples as for soil carbon; see below) | Stempfhuber, Schloter 
Phosphorus uptake and retention | Proportion of P in plants and microbes: (shoot P stock + microbial P stock) / (shoot P stock + microbial P stock + soil extractable P [NaHCO3]) | Alt, Sorkau, Oelman, Wilcke, Klaus, Kleinebecker, Hölzel 
Arbuscular mycorrhizal fungal root colonization | Cultured in sterile soil in the field from April to October 2011 and then extracted with sodium hexametaphosphate (35 g l⁻¹). Hyphal length was quantified after staining with trypan blue | Morris, Rillig 
Stability of soil aggregates | A subsample of the same soil as above (AMF colonization) was passed through a 250 µm sieve under water to determine the percentage of water-stable macroaggregates | Morris, Rillig 
Soil organic carbon | Measured in 14 soil cores (0-10 cm). Calculated as the difference between total carbon (measured with a CN analyzer “Vario Max” [Elementar Analysensysteme GmbH, Hanau, Germany]) and inorganic carbon (determined after combustion of organic carbon in a muffle furnace; 450 °C for 16 h), done in samples collected in May 2011 | Schöning, Solly, Schrumpf 
Forage quality | Calculated as a function of the mean of scaled crude protein concentration and scaled relative forage value, done in May-June in 2008-2012 | Klaus, Kleinebecker, Hölzel 
Resistance to plant pathogens | Calculated as the inverse of the total cover of foliar fungal pathogens. The cover of pathogens was measured in four 25 × 1 m transects per plot, where the proportion of plants infected and the leaf area infected of these individuals were measured; done in October 2011 | Blaser, Prati, Fischer 
Pest control | Number of trap-nesting wasps known to feed on pest insects, done between April and October 2008 | Steckel, Westphal, Steffan-Dewenter 
Pollinator abundance | Estimated as the total abundance of flower visitors, measured in one 200 × 3 m transect per plot, done in May 2008 | Krauss, Klein, Weiner, Werner, Blüthgen 
Bird diversity | Measured as the cumulative species richness estimated by audio-visual point-counts, done in May-June 2008-2010 | Renner, Böhm, Tschapka 
Flower cover | Measured as the number of inflorescences in four 50 × 3 m transects per plot. Flower area for each species was obtained from the literature | Binkenstein, Schaefer 

Note that for some groups the taxonomic unit was either operational taxonomic units (OTU: fungi and protists) or families (bacteria and below-ground insect larvae). Abundance measures were: 
per cent cover (plants, bryophytes), number of individuals captured (arthropods) and relative proportion of sequence reads assigned to each family among all reads within each plot (protists, soil 
bacteria and mycorrhiza). 
Extended Data Table 3 | Correlations between diversity predictors from the models in the main text 

Primary producers 0.06 0.09 0.03 -0.07 -0.01 0.01 -0.13 0.05 0.06 0.06 -0.05 0.05 -0.06 0.10 -0.02 -0.04 -0.16 
Belowground herbivores 0.21 0.08 0.19 0.00 0.18 -0.02 0.02 0.15 0.37 0.10 0.13 0.08 0.06 0.18 -0.05 0.10 
Belowground predators 0.14 -0.16 0.05 0.09 -0.23 0.04 0.16 0.22 0.54 0.12 0.09 0.20 0.15 0.09 0.12 
Detritivores 0.12 -0.10 -0.02 0.20 -0.02 0.04 0.15 0.0sf 0.74 0.12 -0.13 0.02 0.03 -0.15 
Aboveground herbivores 0.01 0.16 0.32 -0.05 -0.05 0.05 0.08 0.22 0.54 -0.04 0.14 0.34 -0.06 
Soil microbial decomposers 0.27 -0.02 0.05 0.00 0.01 0.04 0.05 -0.11 0.17 0.25 -0.05 0.08 
Bacterivores -0.13 0.06 0.24 0.11 0.12 0.00 -0.10 0.13 07q -0.14 0.25 
Aboveground predators 0.04 0.14 0.02 0.11 0.14 0.38 -0.16 0.18 0.59 -0.18 
Plant symbionts 0.06 0.03 -0.10 0.04 -0.12 0.03 -0.06 0.03 0.46 

Primary producers -0.13 -0.13 -0.09 0.16 0.38 -0.26 0.02 -0.22 
Belowground herbivores 0.24 0.18 0.11 -0.03 0.09 0.11 0.13 
Belowground predators 0.08 0.13 0.19 0.25 -0.02 -0.07 
Detritivores 0.19 -0.12 0.02 0.09 -0.08 
Aboveground herbivores 0.02 -0.03 0.48 -0.15 
Soil microbial decomposers 0.21 0.06 0.17 
Bacterivores 0.17 0.21 
Aboveground predators -0.17 


pH 0.22 -0.09 012 -0.21 -0.40 030 025 -003 0.08 -018 0: 
~ Soil depth 021 0.02 052 -0.28 -0.22 0.11 022 -0.29 -0.15 -0.06 0.10 060 0.62 -0.16 
: Fertilisation -0.22 0.00 0.01 -0.06 0.22 0.06 -0.06 0.00 010 001 -0.16 -0.23 -0.04 
g 


i 


0.21 -0.07 -0.05 -0.35 0.21 -0: 
0.26 0.39 042 0.10 0.02 037 
0.13 -0.06 -0.08 -0.01 0.10 -0.06 
Mowing 0.05 0.18 -0.03 -0.09 0.19 0.09 -0.15 0.17 -0.01 -0.04 -0.12 -0.40 -0.13 0.10 0.09 -0.24 -0.02 -0.04 0.17 -0.16 
Grazing 0.09 -0.13 0.01 0.04 -0.02 004 016 026 027 OO1 O11 013 012 014 -004 0.11 007 032 0.12 025 
Elevation 
TMi 


a8 


055 031 018 0.11 028 029 022 0.05 -0.04 -061 060 019 -020 -033 0.39 -0.32 -0.13 0.10 -0.42 
0.19 -0.20 -0.02 0.15 -028 -0.36 0.19 009 035 -050 -011 019 0.27 0.28 041 -0.08 0.14 0.10 


Primary producers 0.06 0.13 0.06 0.22 0.06 -0.08 -0.27 0.21 0.12 0.06 0.15 -0.11 0.10 -0.10 -0.13 0.14 -0.41 
Belowground herbivores 0.26 0.02 0.11 0.26 028 -0.05 -0.08 0.19 048 0.08 -0.03 0.22 -0.09 024 0.12 0.20 
Belowground predators 0.09 -0.01 0.08 014 -0.19 -0.09 -0.08 024 051 0.06 0.09 0.09 0.13 0.13 0.05 
2 Detritivores 0.02 -0.15 -0.05 0.16 0.15 0.16 0.09 0.01 -0.06 0.02 0.08 0.03 
Aboveground herbivores 0.07 0.06 0.23 -0.25 0.17 0.05 -0.06 -0.14 0.03 0.31 0.29 
Soil microbial decomposers 047 0.18 0.05 0.17 0.17 0.01 -0.21 0.09 -0.11 046 0.18 0.32 
Bacterivores 0.17 0.05 0.07 015 0.12 -0.08 -0.07 0.08 0.20 0.40 
Aboveground 
0.38 0.01 0.06 0.19 0.29 028 0.13 -0.05f 0.6 
Primary producers 0.14 -0.22 -0.24 0.42 -0.31 0.13 0.14 0.14 
Belowground herbivores 0.22 0.08 0.18 -0.08 0.11 0.03 0.19 
4 Belowground predators 0.17 -0.01 0.27 0.20 0.02 0.12 
Detritivores 0.04 -0.01 -0.05 0.15 0.08 
Aboveground herbivores -0.17 -0.09 0.41 -0.22 
= ‘Soil microbial decomposers 0.14 0.04 0.24 
Bacterivores 0.18 0.46 
predators -0.09 


Correlations between residuals (after controlling for the effect of study region, soil type, pH, topographic wetness index and the three land-use intensity components: fertilization, mowing and grazing) 
of abundance and species richness of the nine different trophic groups considered (top) or of the raw data (bottom). Those predictors removed owing to multicollinearity problems are shaded grey, 
with the correlation responsible highlighted. TWI = topographic wetness index, obtained from P. M., unpublished data. 
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Extended Data Table 4 | Model selection 


Biomass 
Forage quality 
Potential nitrification 
Root biomass 
Root decomposition 
Phosphorus retention 
Soil aggregate stability 
Soil C 
Pest control 
Resistance to pathogen 
Pollinator abundance 
Bird diversity 
Flower cover 
Multifunctionality 25% 
Multifunctionality 50% 
Multifunctionality 75% 
Multifunctionality 90% 
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Primary producers 


Belowground herbivores —-1.85 -1,95 
¢ Belowground predators -1.90 -0.06 
~ Aboveground herbivores —-0.74 -1.74 
£ Soil microbial 
< decomposer -1.57 -1.80 
Aboveground predators -1,28 
Plant symbionts -1.83  -2.00 


Primary producers 
Belowground herbivores 
Belowground predators 


1.74 
-2.00 
-159 -0.04 -1.84 


-197 -056 -183  -165 


“176-153 -1.47 | 004) 


“ Detritivores -1.32  -1.68 -1.63 -1.32 -1.70  -1.64 

& Aboveground herbivores -1.94 -0.65 -1.57 -1,49 -1.98 -1,98 

Soil microbial 

> decomposer 137 -199 -154 -166 -166 -163  -1.92 -0.43 -0.77 
Bacterivores -0.86 -1.91 HA) -187 -190 -156 -1.33 -105 -1.90 
Aboveground predators -1.93  -1.43 -0.75 -1.38 -188 -134 -053 -2.00 -0.48 -1.60 


Plant symbionts -1.87  -1.91 


Difference in AIC when subtracting each term regarding the full model according to the backward step AIC procedure used (models using the environmental-corrected residuals, as presented in Fig. 1 
and Extended Data Fig. 1). Green shade indicates the terms included in the most parsimonious models. Orange shade indicates terms included in the model with the lowest AIC but further removed 
using F-ratio tests. 
The TRPM2 ion channel is required for sensitivity to warmth 


Chun-Hsiang Tan & Peter A. McNaughton 


Thermally activated ion channels are known to detect the 
entire thermal range from extreme heat (TRPV2), painful heat 
(TRPV1, TRPM3 and ANO1), non-painful warmth (TRPV3 and 
TRPV4) and non-painful coolness (TRPM8) through to painful 
cold (TRPA1)!~’. Genetic deletion of each of these ion channels, 
however, has only modest effects on thermal behaviour in mice™ , 
with the exception of TRPM8, the deletion of which has marked 
effects on the perception of moderate coolness in the range 
10-25°C?. The molecular mechanism responsible for detecting 
non-painful warmth, in particular, is unresolved. Here we used 
calcium imaging to identify a population of thermally sensitive 
somatosensory neurons which do not express any of the known 
thermally activated TRP channels. We then used a combination of 
calcium imaging, electrophysiology and RNA sequencing to show 
that the ion channel generating heat sensitivity in these neurons 
is TRPM2. Autonomic neurons, usually thought of as exclusively 
motor, also express TRPM2 and respond directly to heat. Mice in 
which TRPM2 had been genetically deleted showed a striking deficit 
in their sensation of non-noxious warm temperatures, consistent 
with the idea that TRPM2 initiates a ‘warm’ signal which drives cool-seeking 
behaviour. 

Previous studies have described novel heat-sensitive neurons 
not activated by agonists for any of the known heat-sensitive ion 
channels*!*"!*. We identified these neurons in cultures from dorsal 
root ganglia (DRG) by using calcium imaging and selective agonists for 
known thermo-TRP ion channels. We found that around 10% of neu- 
rons responded to heat, but were not activated by any of the agonists for 
known TRP channels (Fig. 1a–c and Supplementary Video 1). Changing 
the order of application of agonists or using a lower starting tempera- 
ture had little effect on this proportion (Extended Data Figs 1 and 2). 
Novel heat-sensitive neurons were found to be activated over a wide 
range of temperatures, with a subset activated in the range of warm tem- 
peratures between 34°C and 42°C, suggesting a possible role in warmth 
sensation, and a second group activated only at higher temperatures 
(Fig. 1d, e). Novel heat-sensitive neurons were significantly larger than 
either TRPV1-expressing or TRPM3-expressing neurons (Extended 
Data Fig. 3). We investigated the possibility that the novel heat- 
sensitivity may be co-expressed with TRPV1 and TRPM3 by blocking 
these channels with antagonists before applying heat; the higher propor- 
tion of neurons responding (46% in Extended Data Fig. 4a, compared 
to <10% expressing only the novel heat-sensitive mechanism, see Fig. 1 
and Extended Data Figs 1, 2 and 4b) shows that there is significant 
co-expression of the novel heat-sensitive mechanism with TRPV1 and 
TRPM3. The phenotype of neurons expressing the novel heat-sensitive 
mechanism was investigated by identifying novel heat-sensitive neu- 
rons in the presence of TRP channel blockers (Extended Data Fig. 5a, 
red) and then exposing them to IB4, which marks a non-peptidergic 
neuronal population (Extended Data Fig. 5b, green). These results show 
that the novel heat-sensitive mechanism is predominantly expressed in 
IB4⁺ neurons (74% IB4⁺). 


We next sought to identify a source of neurons which expresses a 
less-complex set of heat-sensitive ion channels than are present in 
DRG neurons. In isolated sympathetic neurons from the superior 
cervical ganglion (SCG) we found that no neuron showed an increase 
in intracellular calcium concentration ([Ca²⁺]i) in response to agonists 
of known thermo-TRP channels, but that 58% showed a significant 
response to heat (Fig. 2a and Extended Data Fig. 6). Similar results 
were obtained in parasympathetic neurons isolated from the pterygo- 
palatine ganglion (PPG), in which 49% of neurons showed novel heat 
sensitivity (not shown). Autonomic neurons therefore express the novel 
heat-activated ion channel in isolation, which offers an advantageous 
preparation for determining its properties. 

We found that the heat-activated calcium increase in autonomic 
neurons was due to an influx of Ca²⁺ from the external solution, and 
that it was reduced but not abolished by removal of external Na⁺ 
(Extended Data Fig. 6c). The TRPV channel family activator 2-APB!” 
suppressed the heat-activated Ca²⁺ influx, and the TRPV blocker 
ruthenium red? had no effect (Extended Data Fig. 6d, e), making it 
unlikely that the novel heat-activated mechanism is a TRPV family 
member. The Ca²⁺ influx was unaffected by the voltage-dependent Na⁺ 
channel blocker tetrodotoxin (TTX) at a high enough concentration to 
block the TTX-insensitive Na⁺ channels Nav1.8 and Nav1.9 (Extended 
Data Fig. 6f). L-type Ca²⁺ channel blockers prevented firing of action 
potentials during simultaneous recordings of membrane voltage and 
intracellular calcium imaging, but did not completely block the 
heat-activated Ca²⁺ influx (Fig. 2b; see also Extended Data Fig. 6g). Together 
these observations show that the heat-sensitive ion channel is permeable 
both to Ca²⁺ and to Na⁺, and that when the channel is opened by 
heat, the resulting depolarization activates L-type calcium channels 
and thus augments the calcium influx. The fact that membrane current 
through the novel heat-activated ion channel is carried by Na⁺ and 
Ca²⁺ makes it unlikely that the channel is ANO1, which is permeable 
to chloride ions’. 

We next investigated the voltage- and time-dependent behaviour of 
the novel heat-activated ion channel, isolated by suppressing voltage- 
dependent calcium, sodium and potassium currents’. The current-voltage 
relation was approximately linear, with a reversal potential close to 
zero (Fig. 2c), and the channel showed no time-dependent gating by 
membrane voltage (Extended Data Fig. 7). Interestingly, activation of 
the channel by mild warmth was strongly potentiated by hydrogen 
peroxide, both in autonomic neurons (Fig. 2d) and in DRG neurons 
(Extended Data Fig. 4c, d). 

To identify the ion channel responsible for the novel heat sensitivity 
we carried out an analysis of mRNA expression by the use of RNA 
sequencing (RNA-seq). We investigated two cell lines, MAH cells'® 
and PC12 cells, both derived from rat adrenal cells, which share many 
properties with sympathetic neurons. Like primary autonomic neurons, 
no cell in either line responded to agonists of any of the conventional 
thermo-TRP channels, but a significant fraction of cells responded to 
heat with thermal thresholds similar to those in somatosensory neurons 
Figure 1 | Around 10% of somatosensory neurons demonstrate a novel 
heat-sensitive response. a, Examples of increases of [Ca²⁺]i measured with 
fura-2 (F340/380 ratio, ordinate) in response to known TRP channel agonists 
and to heat. From top: TRPV1-expressing neuron responding to capsaicin 
(caps, red); TRPM3-expressing neuron responding to pregnenolone 
sulfate (PS, blue); TRPV1- and TRPM3-co-expressing neuron (brown); 
neuron unresponsive to TRP channel agonists but showing a response 
to heat (bottom trace). See Supplementary Information Video 1. 
b, Percentages of neurons expressing TRPV1, TRPM3 and the novel 
heat-sensitive mechanism in DRG neurons from wild-type (WT) (top) and 
Trpm2−/− knockout (KO) mice (bottom). Green: neurons responding to 
2-APB but not to other agonists. c, Histogram of maximal F340/380 ratio 
increase in neurons insensitive to TRP agonists. Black, wild-type neurons; 
red, Trpm2−/− neurons. Vertical dashed lines: thresholds discriminating 
between heat-sensitive and insensitive neurons (see Extended Data Fig. 1). 
d, Thermal thresholds of novel heat-sensitive DRG neurons. Increases 
of F340/380 ratio (top) in response to slowly rising heat ramp (bottom). 
e, Temperature thresholds of novel heat-sensitive neurons from wild-type 
(black) and Trpm2−/− mice (red). Proportion in range 34–42 °C reduced 
from 4.4% in wild type to 0.9% in Trpm2−/− (P < 0.0001; Fisher’s exact 
test), and in range 42–48 °C from 15% in wild type (290/1,890) to 3% in 
Trpm2−/− (55/1,800) (P < 0.0001; Fisher’s exact test). Cell numbers and 
replicates for a–c: 1,324 DRG neurons from one wild-type mouse 
on 4 coverslips and 981 DRG neurons from one Trpm2−/− mouse on 
4 coverslips were imaged. Similar results were obtained using 52 additional 
coverslips from 9 additional wild-type mice and 10 coverslips from 
3 additional Trpm2−/− mice; some of these results are shown in Extended 
Data Figs 1, 2 and 4. Cell numbers and replicates for d and e: 1,890 
DRG neurons from one wild-type mouse on 8 coverslips and 1,800 DRG 
neurons from one Trpm2−/− mouse on 8 coverslips were imaged. Similar 
results were obtained using 10 coverslips from one additional wild-type mouse. 


(Extended Data Fig. 8). We also found that the fraction of heat-sensitive 
neurons in both lines was reduced by differentiation to a neuronal-like 
phenotype (Extended Data Fig. 8, red bars). 

The properties of mixed Na⁺/Ca²⁺ permeability, a reversal potential 
near 0 mV and absence of time-dependent gating by membrane potential 
(see earlier) are consistent with a member of the large TRP and CNG 
ion channel families. RNA-seq analysis of the MAH cell line showed 
that detectable mRNA was present only for the seven TRP channels 
shown in Table 1, out of all TRP and CNG channels. TRPC3, TRPV2, 
TRPM4 and TRPM7 are unlikely candidates for the warmth-sensitive 
Figure 2 | Properties of the novel heat-sensitive ion channel in 
autonomic neurons. a, Sympathetic neurons from superior cervical 
ganglion (SCG) respond to heat but not to TRP channel agonists. b, The 
L-type Ca²⁺ channel blocker nifedipine (10 µM) blocks spiking but not 
steady depolarization in response to heat in a patch-clamped PPG neuron 
(top), and reduces, but does not eliminate, the Ca²⁺ increase (middle). 
Simultaneous recording of membrane potential and [Ca²⁺]i (F340/380 ratio) 
in current-clamped PPG neuron. c, Current–voltage relations at 36 °C 
and 47 °C of voltage-clamped PPG neuron in response to voltage ramp 
shown in the top left. Similar I/V relation observed with inverse voltage 
ramp starting from −100 mV (Extended Data Fig. 7). See Methods for 
details. d, H₂O₂ (400 µM) potentiates Ca²⁺ increase in response to a mild 
temperature stimulus (42 °C) in SCG neurons. Note potentiation is 
long-lasting after H₂O₂ is removed. Similar results obtained in PPG neurons. 
Percentage of neurons (n = 456) responding to heat, determined as in 
Extended Data Fig. 1b, was 5% before, 59% during and 54% post-H₂O₂, 
respectively. Cell numbers and replicates for a: 166 SCG neurons 
from 3 wild-type mice on 3 coverslips were imaged for this experiment. Similar 
results were obtained with SCG neurons using 15 additional coverslips from 9 
additional mice and with PPG neurons using 14 coverslips from 
9 additional mice. Cell numbers and replicates for b: 2 PPG neurons on 
2 coverslips were simultaneously patch-clamped and imaged and these showed a 
similar response. Cell numbers and replicates for c: 3 PPG neurons on 
3 coverslips were simultaneously patch-clamped and imaged and these showed a 
similar response. Cell numbers and replicates for d: 456 SCG neurons 
from two wild-type mice on 4 coverslips were imaged and these showed a 
similar enhancement with H₂O₂; 731 PPG neurons (not shown) from 
three wild-type mice on 3 coverslips were imaged and showed a similar 
enhancement to the SCG neurons. Similar results were obtained with SCG 
neurons using 9 coverslips from 5 additional mice and with PPG neurons 
using 3 coverslips from 3 additional mice. 


channel as all have nonlinear current-voltage (I/V) relations!’, whereas 
the warmth-sensitive ion channel is linear (Fig. 2c). TRPC1 and TRPV2 
are strongly upregulated by culture in differentiation medium (Table 1), 
which contrasts with the downregulation observed for the heat-sensitive 
ion channel (Extended Data Fig. 8a). TRPV2 is activated only by 
extreme heat”? and TRPM4 is calcium-impermeable”!, neither of which 
is consistent with the properties of the novel heat-sensitive ion channel 
(see above). TRPC2 is a pseudogene in primates and seems unlikely to 
play an important role in behavioural warmth sensation, a fundamental 
property in all mammals. This leaves only TRPM2, which has a linear 
I/V relation (see Fig. 2c)”, is activated at temperatures above 35 °C” 
and is expressed in DRG neurons”*“, as the most probable candidate 
for the warmth-sensitive ion channel. Thermal activation of TRPM2 
is enhanced by hydrogen peroxide” and the channel is blocked by the 
chemical 2-APB”®, both of which are characteristics shared by the novel 
heat-sensitive ion channel (Fig. 2d and Extended Data Figs 4c, d and 6d). 
Table 1 | mRNA levels of TRP ion channel genes in MAH cells 


Gene    FPKM in growth medium    FPKM in differentiation medium    Log2 fold change    P value    q value 
Trpc1   1.779                    4.700                              1.401              <0.001     <0.001 
Trpc2   6.804                    4.995                             -0.446               0.121      0.323 
Trpc3   5.641                    1.309                             -5.729              <0.001     <0.001 
Trpv2   0.035                    4.003                              6.840              <0.001     <0.001 
Trpm2   1.017                    0.970                             -0.069               0.800      0.922 
Trpm4   0.277                    0.400                              0.532               0.112      0.310 
Trpm7   18.108                   15.428                            -0.231               0.196      0.428 

All genes within the TRP channel family and the six cyclic nucleotide-regulated channels (Cnga1, 
Cnga2, Cnga3, Cnga4, Cngb1, and Cngb3) were examined. The seven genes listed are the ones 
whose mRNA levels are detectable in MAH cells in either condition. FPKM, fragments per kilobase 
of transcript per million mapped reads. The P values represent significance of change of 
expression using one-way ANOVA; q values are post-hoc corrected for multiple testing using the 
false discovery rate (FDR) method. Two biological replicates in each condition were deep-sequenced. 
Growth medium contained dexamethasone (5 µM), differentiation medium contained the growth 
factors bFGF (10 ng ml⁻¹), CNTF (10 ng ml⁻¹) and NGF (50 ng ml⁻¹). 
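
The log2 fold change column can be related to the two FPKM columns with a short calculation. The sketch below assumes the fold change is simply log2(FPKM in differentiation medium / FPKM in growth medium) computed from the rounded values printed in the table; the authors' pipeline may have used unrounded values or a different estimator, so small differences in the last digit are expected. The dictionary and function names are illustrative only, and only three rows are used.

```python
from math import log2

# (FPKM in growth medium, FPKM in differentiation medium), copied from Table 1.
fpkm = {
    "Trpc1": (1.779, 4.700),
    "Trpv2": (0.035, 4.003),
    "Trpm2": (1.017, 0.970),
}


def log2_fold_change(growth: float, differentiation: float) -> float:
    """Log2 ratio of differentiation-medium to growth-medium expression.

    Assumed definition; positive values mean upregulation on differentiation.
    """
    return log2(differentiation / growth)


fold_changes = {gene: log2_fold_change(*values) for gene, values in fpkm.items()}

for gene, value in fold_changes.items():
    print(f"{gene}: log2 fold change = {value:.3f}")
```

Under this assumption Trpm2 shows almost no change in expression between the two media, whereas Trpc1 and Trpv2 are upregulated.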


We confirmed the identity of the novel heat-sensitive ion channel 
using calcium imaging experiments on neurons from Trpm2−/− 
mice’””®, The proportions of DRG neurons responding to TRP chan- 
nel agonists were similar to those in wild-type neurons (Fig. 1b and 
Extended Data Figs 1c and 2c), but the proportion of novel heat-sensitive 
neurons was significantly reduced, from ~10% to around 3% (Fig. 1c, 
red bars; P < 0.0001, Fisher's exact test, see also Extended Data Figs 1, 2 
and 4). Moreover, the mean amplitude of the increase in the fura-2 
fluorescence ratio at 340 and 380 nm (an index of the increase in intracellular 
calcium) in Trpm2−/− neurons not responding to TRP channel 
agonists was greatly reduced, from F340/380 = 2.336 ± 0.2060 in wild type, 
to 0.6202 ± 0.2435 in Trpm2−/− (Fig. 1c; P < 0.0001, two-tailed unpaired 
t-test). Notably, the sensitivity of the heat response to hydrogen perox- 
ide was abolished by deletion of Trpm2 (Extended Data Fig. 4c, d). In 
experiments on the thermal thresholds of novel heat-sensitive neurons 
(Fig. 1d, e), deletion of Trpm2 abolished almost all thermal sensitivity 
in the range 34-42 °C (reduced from 4.4% in wild type to 0.9% in 
Trpm2−/−, P < 0.0001; Fisher’s exact test), though some neurons still 
responded to temperatures in the noxious thermal range (reduced from 
15% in wild type to 3.1% in Trpm2−/−, P < 0.0001; Fisher’s exact test). 
We also considered the possibility that TRPM2 may contribute to the 
heat response in DRG neurons in which it is co-expressed with TRPV1 
and/or TRPM3. Deletion of Trpm2 was found not to affect the maximum 
amplitude of the response to heat in TRPV1-expressing neurons, but it 
did reduce the response of TRPM3-expressing neurons (Extended Data 
Fig. 9), suggesting that co-activation of TRPM2 and TRPM3 is important 
in determining the heat responses of these neurons. Finally, in SCG neurons 
from Trpm2−/− mice, both the numbers and response amplitudes 
of heat-sensitive neurons were greatly reduced (Extended Data Fig. 6b). 
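
The Fisher's exact tests quoted above can be illustrated with the one comparison for which the Fig. 1 legend gives raw counts: 290 of 1,890 wild-type neurons against 55 of 1,800 knockout neurons responding in the 42–48 °C range. The sketch below is a plain-Python two-sided test written for illustration, not the software the authors used. It assumes the common convention of summing all tables that are no more probable than the observed one, and the function and variable names are hypothetical.

```python
from math import exp, lgamma


def log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def log_prob(x: int) -> float:
        # Hypergeometric log-probability of x in the top-left cell,
        # with all margins held fixed.
        return log_comb(row1, x) + log_comb(row2, col1 - x) - log_comb(total, col1)

    observed = log_prob(a)
    p = 0.0
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        lp = log_prob(x)
        if lp <= observed + 1e-9:  # tables at least as extreme as the observed one
            p += exp(lp)
    return min(p, 1.0)


# Counts from the Fig. 1 legend (42-48 °C range).
wt_responding, wt_total = 290, 1890
ko_responding, ko_total = 55, 1800

p_value = fisher_exact_two_sided(
    wt_responding, wt_total - wt_responding,
    ko_responding, ko_total - ko_responding,
)
print(f"wild type: {100 * wt_responding / wt_total:.1f}% responding")
print(f"knockout:  {100 * ko_responding / ko_total:.1f}% responding")
print(f"two-sided P = {p_value:.3g}")
```

Working in log space keeps the hypergeometric terms from overflowing at these sample sizes.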

Expression of mRNA that encodes TRPM2 was demonstrated in 
heat-sensitive DRG neurons using in situ hybridization. Trpm2 mRNA 
was surprisingly abundant in neurons, being expressed in 89% of 
DRG neurons, but was expressed in, at most, a very small minority 
of glial cells (Extended Data Fig. 10). In prior tests for heat sensitivity 
in the same neurons, 42% of neurons positive for Trpm2 mRNA had 
responded when TRPV1 and TRPM3 were blocked. The reason why 
not all neurons positive for Trpm2 mRNA respond to heat is not clear, 
but could be due to a low conversion of mRNA into TRPM2 protein 
or to low trafficking to the membrane in around half of DRG neurons. 
However, many fewer neurons responded to heat in the population not 
expressing mRNA for Trpm2 (13%, see Extended Data Fig. 10), which 
is consistent with the small number of heat-sensitive neurons still seen 
in DRG cultures from Trpm2−/− mice (see for example, Fig. 1b). 

Finally, we investigated whether genetic deletion of Trpm2 has an 
impact on thermal preference in mice. The most striking behavioural 
difference was that wild-type mice avoided the non-noxious warm 
temperature of 38 °C, whereas Trpm2−/− mice showed little preference 
(Fig. 3c, e and Supplementary Video 2). The difference became much 
less noticeable at 43 °C, when noxious-heat avoidance mechanisms 
Figure 3 | Deletion of TRPM2 shifts adult male mouse thermal 
preference towards warmer temperatures. a–d, Two-plate thermal 
preference test. One plate is at 33 °C at the start of the experiment, the 
other (‘plate A’) is at a variable temperature as shown above each graph. 
Temperature reversed at t = 30 min to account for any possible bias caused 
by external cues. Points show mean behavioural preference averaged over 
5 min intervals (error bars represent the mean ± s.e.m., n = 11 for wild 
type and 9 for Trpm2−/−). Mice were wild-type or Trpm2−/− (KO) male 
littermates, aged 12–16 weeks. No difference was observed between 
the thermal behaviour of wild-type and heterozygous Trpm2+/− mice (not 
shown, n = 4). e, Mean thermal preference averaged from the data between 
t = 15–30 min and t = 45–60 min shown in a–d. Error bars represent the 
mean ± s.e.m. **P < 0.01; ***P < 0.001; NS, P > 0.05; unpaired t-test. 
Similar results were obtained with experiments using non-littermates in 
which 12 wild-type male mice from Charles River were compared with 
7 Trpm2−/− male mice from homozygote breeding pairs. 


driven by TRPV1, TRPM3 and ANO1 become important? ®. Wild-
type mice also showed a weaker aversion than Trpm2−/− mice for 
the non-noxious cool temperature of 23°C (Fig. 3a, e). Expression of 
TRPM2 therefore causes wild-type mice to prefer cooler temperatures 
over a range of temperatures extending from 23°C to above 38°C, 
though we note that the actual temperature at the sensory nerve ending 
will be higher than the plate temperature, particularly at the lower end 
of this range, because of the influence of body temperature. 

The work presented here shows that TRPM2 accounts for a novel 
heat-sensitive mechanism in both somatosensory and autonomic neurons. 
The altered thermal preference in Trpm2−/− mice supports the hypothesis 
that TRPM2 expressed in somatosensory neurons provides a non-noxious 
‘warm’ signal which drives mice to seek cooler temperatures over a wide 
temperature range, from 23°C to above 38°C. Other studies have shown 
that TRPM8 provides a ‘cool’ signal which drives warmth-seeking behav- 
iour over the range 10-25 °C}3, whereas activation of TRPV1, TRPM3 
and ANO1 provides a high-temperature noxious-heat-avoidance signal*®. 


© 2016 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 
METHODS 


Animals. All in vitro experiments used C57BL/6 mice younger than 5 weeks old, 
except for adult DRG temperature threshold experiments (for example, Fig. 1c, d), 
in which 3-month-old adult mice were used (these mice were from the same 
group as was used for two-plate thermal preference test experiments, see Fig. 3). 
Trpm2−/− mice were gifts from Y. Mori, and were generated as reported 
previously’”8, Mice were maintained on a 12h day/12h night cycle. All mice 
used in two-plate thermal preference tests had been backcrossed onto the parental 
C57BL/6J strain for 7 generations, and wild-type and Trpm2−/− mice were 
littermates from breeding pairs of Trpm2+/− heterozygote mice. 

Primary neuron cultures. PPG, SCG and paravertebral chain ganglia were 
extracted from 3 or more mice and DRG from a single mouse. Ganglia were 
incubated in papain (2 mg ml⁻¹ in Ca²⁺-free and Mg²⁺-free HBSS) for 30 min 
at 30 °C, followed by incubation in collagenase (2.5 mg ml⁻¹ in Ca²⁺-free and 
Mg²⁺-free HBSS) for 30 min at 37 °C. Ganglia were re-suspended and mechanically 
dissociated in Neurobasal-A/B27 growing medium, which was prepared 
with Neurobasal-A Medium supplemented with 0.25% (v/v) L-glutamine 200 mM 
(Invitrogen), 2% (v/v) B-27 supplement (Invitrogen), 1% (v/v) penicillin-streptomycin 
(Invitrogen), and nerve growth factor (NGF) (Sigma-Aldrich) at 50 ng ml⁻¹. 
Dissociated neurons were centrifuged and plated onto coverslips pre-coated with 
poly-L-lysine (10 µg ml⁻¹) and laminin (40 µg ml⁻¹). Neurons were kept in a 37 °C 
incubator with a 95% air / 5% CO₂ atmosphere for at least 3 h before use, and all 
neurons were used within 24 h. 

PC12 cell cultures. The growth medium used for PC12 cell culture was RPMI- 
1640 (Sigma-Aldrich), supplemented with: 1% (v/v) penicillin-streptomycin 
(Invitrogen), 1% (v/v) L-glutamine 200 mM (Invitrogen), 10% (v/v) horse serum 
(Invitrogen) and 5% (v/v) fetal bovine serum (FBS, Invitrogen). The differentiation 
medium for PC12 cells was RPMI-1640 supplemented with: 1% (v/v) penicillin-streptomycin 
(Invitrogen), 1% (v/v) L-glutamine 200 mM (Invitrogen), 1% (v/v) 
horse serum (Invitrogen), and NGF (Sigma-Aldrich) at a final concentration of 
100 ng ml⁻¹. PC12 cells were incubated and maintained in a 37 °C incubator with 
a 95% air / 5% CO₂ atmosphere. Medium was changed every 2 days, and cells were 
split every 3–4 days when grown to 90% confluency. The PC12 cells were seeded for 
imaging on coverslips pre-coated with poly-L-lysine (1 mg ml⁻¹; Sigma-Aldrich) 
and collagen IV (1 mg ml⁻¹; Sigma-Aldrich). PC12 cell lines were not authenticated 
and were not tested for mycoplasma contamination. 

MAH cell cultures. MAH cells were kind gifts from A. Tolkovsky and S. Birren!®. 
The growth medium used for MAH cell culture was L-15 medium (Sigma-Aldrich) 
supplemented with: 1% (v/v) penicillin-streptomycin (Invitrogen), 1% (v/v) 
L-glutamine 200 mM (Invitrogen), 10% (v/v) fetal bovine serum (FBS, Invitrogen), 
17% (v/v) NaHCO₃ (150 mM), and dexamethasone (Sigma-Aldrich) at 5 µM. 
The differentiation medium for MAH cells was the same as the growth medium 
except dexamethasone was replaced with a cocktail of neurotrophic factors: CNTF 
(10 ng ml⁻¹; Peprotech), bFGF (10 ng ml⁻¹; Peprotech), and NGF (50 ng ml⁻¹; 
Sigma-Aldrich). Medium was changed every 2 days and MAH cells were split every 
4 days when grown to 90% confluency and incubated and maintained in a 37 °C 
incubator with a 95% air / 5% CO₂ atmosphere. MAH cells used for imaging were 
seeded on coverslips pre-coated with poly-L-lysine (1 mg ml⁻¹; Sigma-Aldrich) 
and laminin (40 µg ml⁻¹; BD Science). MAH cell lines were not authenticated and 
were not tested for mycoplasma contamination. 

Extracellular solutions and perfusion system for electrophysiology. Unless 
otherwise specified, all experiments were carried out with extracellular solution 
containing 140 mM NaCl, 4 mM KCl, 1.8 mM CaCl₂, 1 mM MgCl₂, 10 mM 
HEPES and 5 mM glucose; pH was adjusted to 7.4 with NaOH and osmolarity 
was 295–305 mOsm. Sodium-free extracellular solution was prepared 
with the formulation above except that sodium chloride was replaced with 
equimolar choline chloride. Calcium-free extracellular solution was prepared with the 
formulation above except that calcium chloride was omitted. An 8-line manifold 
gravity-driven system controlled by an automated solution changer with a com- 
mon outlet was used to apply solution to the cells. The temperature in three lines 
was heated or cooled with a Peltier device regulated by a proportional gain feed- 
back controller designed by V. Vellani (CV Scientific). The temperature in each 
experimental protocol was recorded by a miniature thermocouple immediately 
before the solution entered the bath or (in separate control experiments) at the 
cell location at the tip of the solution outlet. All compounds applied were pre- 
pared as stock solutions first and then diluted to the concentration needed before 
experiments. Capsaicin was dissolved in ethanol to make 5 mM stock solutions. 
Pregnenolone sulphate was dissolved in DMSO to make 500 mM stock solutions. 
2-APB was dissolved in DMSO to make 500 mM stock solutions. The TRPV4 
agonist, PF-4674114, was dissolved in DMSO to make 5 mM stock solutions. 
Nifedipine was dissolved in DMSO to make 100 mM stock solutions. TTX was 
dissolved in pH 4.8 citrate buffer to make 100 mM stock solutions. Verapamil, 
ruthenium red, and H₂O₂ were dissolved in extracellular solution on the day of 
experiments. 

Calcium imaging. Cells were loaded with 5|1.M fura-2 AM (Invitrogen) with 
0.02% (v/v) pluronic acid (Invitrogen) for 30 min. After loading, coverslips were 
put in an imaging chamber and transferred to a Nikon Eclipse Ti-E inverted 
microscope. Cells were continuously perfused with extracellular solution and 
were illuminated with a monochromator alternating between 340 and 380 nm 
(OptoScan; Cairn Research), controlled by WinFluor 3.2 software (J. Dempster, 
University of Strathclyde, UK). Emission was collected at 510 nm and the resulting 
pairs of images were acquired every two seconds with a 100 ms exposure time 
using an iXon 897 EM-CCD camera (Andor Technology, Belfast, UK). Image time 
series were converted to TIFF files and processed with ImageJ software. Images
of the background fluorescence intensity were obtained for both wavelengths and
subtracted from the respective image stack before calculating the F340/380 ratio
images. A minority of neurons (<10%) exhibited an unstable F340/380 baseline in
the absence of any applied stimulus, usually caused by poor dye loading but in some
cases apparently due to low-frequency repetitive firing even in the absence of any
treatment, and were removed from analysis. In experiments to identify neurons
responding to known TRP channel agonists (Fig. 1a), we found that PS caused a
very slow increase in F340/380 ratio in some neurons, clearly distinguishable from
the rapid elevation in [Ca2+]i seen in TRPM3-expressing DRG neurons. This slow
response can probably be attributed to an off-target effect of PS as it was also seen
in autonomic neurons which do not appear to express TRPM3 (Fig. 2). A positive
response to all agonists was therefore defined from the rate of increase of [Ca2+]i
following agonist application, as an increase of F340/380 ratio, between two
consecutive time points following application of agonist, which exceeds the mean + 3.09
s.d. (cumulative probability value of 99.9%) of all such differences in the absence of
any agonist. A heat-sensitive neuron is defined as a neuron with a peak increase in
F340/380 during a heat stimulus larger than the mean + 3.09 s.d. of the peak increase
in F340/380 of the glial cells in the same experiment (see Extended Data Fig. 1b).
The thermal threshold of a heat-responsive neuron (see Fig. 1d, e) was defined as
the temperature when the increase in F340/380 ratio between two consecutive time
points is larger than the mean + 3.09 s.d. of the increase in F340/380 of the glial cells
between two consecutive time points in the same experiment. For MAH and PC12 
cell cultures, where no glial cells were present, we used the value of mean + 3.09 s.d. 
obtained from glial cells in similar experiments on neuronal cultures. 
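
The response criteria described above can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names are hypothetical, the use of the sample standard deviation is an assumption (the text does not specify which form was used), and the traces are assumed to be plain lists of background-subtracted F340/380 values sampled at successive time points.

```python
from statistics import mean, stdev

Z_999 = 3.09  # multiplier quoted in the text (cumulative probability 99.9%)


def criterion(reference_values):
    """Mean + 3.09 s.d. of reference values (no-agonist steps or glial responses)."""
    # Assumption: sample standard deviation; the text does not say which form was used.
    return mean(reference_values) + Z_999 * stdev(reference_values)


def first_crossing(ratio_trace, threshold):
    """Index of the first time point whose step from the previous point exceeds threshold."""
    for i in range(1, len(ratio_trace)):
        if ratio_trace[i] - ratio_trace[i - 1] > threshold:
            return i
    return None  # no step exceeded the criterion


def thermal_threshold(ratio_trace, temperatures, threshold):
    """Temperature at the first supra-criterion step, or None if the cell never responds."""
    i = first_crossing(ratio_trace, threshold)
    return None if i is None else temperatures[i]
```

Here `criterion` would be fed either the step-to-step differences recorded without agonist or the corresponding glial-cell values, depending on which of the three definitions is being applied.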

Supplementary Information Video 1 shows an example series of calcium images. 
Patch clamp recordings. The intracellular solution for concurrent calcium imaging 
and patch clamp (see Fig. 2b) contained 140 mM KCl, 1.6 mM MgCl2, 2.5 mM
MgATP, 0.5 mM NaGTP, 10 mM HEPES and 167 µM fura-2; pH was adjusted to
7.3 with KOH. The intracellular solution for current-voltage relationship
determination, in which Ca2+ and K+ currents were blocked (see Fig. 2c), contained
130 mM CsCl, 2.5 mM MgATP, 0.5 mM NaGTP, 10 mM HEPES, 10 mM TEA, and
5 mM 4-AP; pH was adjusted to 7.3 with CsOH. The osmolarity of both of the
intracellular solutions was between 295 and 305 mOsm. The extracellular solution for
current-voltage relationship determination contained 125 mM NaCl, 2 mM CaCl2,
10 mM HEPES, 5 mM glucose, 10 mM TEA, 5 mM 4-AP, 2 µM tetrodotoxin,
and 100 µM CdCl2.

All patch clamp experiments were carried out with an Axopatch 200B patch-clamp
amplifier (Axon Instruments, USA). Patch pipettes (Blaubrand 100 µl
borosilicate glass, Scientific Laboratory Supplies, Germany) were pulled using a
Flaming/Brown P-97 horizontal micropipette puller (Sutter Instruments, USA) and had a
resistance between 3 and 5.5 MΩ. A giga-ohm seal was formed between the patch
pipette and the cell membrane and the pipette capacitance transients were cancelled
before achieving the whole-cell configuration. All experiments were begun in voltage
clamp mode with holding potential at −60 mV at the time of entering whole-cell
mode. After entering the whole-cell mode, series resistance was adjusted to be
lower than 20 mega-ohm. Resting membrane potential was tested and only neurons
with membrane potentials more negative than −50 mV were used for recording.
Data were acquired and analysed with pClamp10 software (Axon Instruments,
USA) and whole-cell currents and voltages were filtered at 1 kHz and sampled at
10 kHz.

RNA-seq. MAH cells were trypsinized and collected as a cell pellet before lysis. 
RNA extraction was performed with the miRNeasy Mini Kit (Qiagen) according 
to the manufacturer’s instructions. Two samples from MAH cells grown in growth 
medium and 2 samples from MAH cells grown in differentiation medium were sent 
to Oxford Gene Technology to complete the rest of the steps for RNA-sequencing. 
Sequencing libraries were prepared with the Illumina TruSeq RNA Sample Prep
Kit v2. A total of 4 samples (two cold-sensitive MAH cells and two cold-insensitive 
MAH cells) were sequenced on 2 lanes on the Illumina HiSeq2000 platform using 
TruSeq v3 chemistry. All sequences were paired-end and sequencing was per- 
formed over 100 cycles. Read files (Fastq) were generated from the sequencing 
platform via the manufacturer’s proprietary software. Reads were processed
through the Tuxedo suite. Reads were mapped to the appropriate
Illumina iGenomes build using Bowtie version 2.02. Splice junctions were
identified using TopHat version 2.0.9. Cufflinks version 2.1.1 was used to perform
transcript assembly, abundance estimation and differential expression and regulation
for the samples. Visualization of differential expression results was performed with
CummeRbund. RNA-seq alignment metrics were generated with Picard.

In situ hybridization following calcium imaging. Coverslips were marked on 
the periphery with a diamond knife to assist localization of the imaged region 
and were then calcium-imaged as above to identify novel heat-sensitive neurons. 
Following calcium imaging, coverslips were rinsed with phosphate-buffered saline 
(PBS) then fixed in 4% paraformaldehyde (PFA) at 4°C for 20 min. TRPM2 mRNA 
was detected with digoxigenin-labelled antisense probes against mouse Trpm2 
(NM_138301.2). We are very grateful to Y. Mori of Kyoto University for providing 
the mouse Trpm2 gene cloned into the pCI-neo plasmid (Promega). 

Probe synthesis. For the Trpm2 antisense probe the plasmid was linearized with 
EcoRI and transcribed with T3 RNA polymerase and for the Trpm2 sense probe, 
which was used as a negative control, the plasmid was linearized with SalI and 
transcribed with T7 RNA polymerase. 

Hybridization. Fixed coverslips were rinsed in PBS with 0.1% Triton, and were 
then incubated in in situ hybridization solution without probe at 47°C for 30 min 
as a pre-hybridization step. After pre-hybridization, coverslips were transferred into
in situ hybridization solution with antisense or sense probes for hybridization at
47°C overnight.

Post-hybridization steps. Following hybridization, coverslips were washed in
2× SSC and then 0.2× SSC at 47°C for 30 min for each solution. Coverslips were
then washed twice with KTBT solution at room temperature for 5 min for each
washing. 25% normal goat serum was then used for blocking cells for 1 h at room
temperature. Coverslips were then incubated in 25% normal goat serum containing
pre-absorbed anti-digoxigenin antibody coupled to alkaline phosphatase for 2 h
at room temperature, followed by washing 3× in KTBT for 15 min each wash, and
then twice in alkaline phosphatase buffer at room temperature for 10 min each
wash. Coverslips were then developed in alkaline phosphatase buffer containing
337.5 µg ml−1 NBT and 175 µg ml−1 BCIP in the dark for 8 h before being washed in
KTBT, fixed in 4% PFA for 10 min, washed in PBS, and then mounted in SlowFade
Gold Antifade Mountant with DAPI. DIC transmitted-light images were acquired
through a Plan Fluor 10× Ph1 DLL objective with a DS-Qi2 monochrome camera
on a Nikon Eclipse Ti-E inverted microscope. A GFPHQ filter was used to enhance
the dark purple colour. The images were rotated, cropped, and resized with ImageJ
to be aligned with the images obtained in calcium imaging.

Staining for isolectin B4 (IB4) following calcium imaging. DRG neurons on 
marked coverslips were calcium imaged as above then rinsed with phosphate- 
buffered saline (PBS) and fixed in 4% paraformaldehyde (PFA) at 4°C for 20 min. 
After fixation, coverslips were washed twice in PBS with 0.1% Triton then incubated 
in solution containing 10 µg ml−1 IB4 bound to Alexa Fluor 594, 10% normal goat
serum, 2% bovine serum albumin, 0.1% Triton, and 10 mM sodium azide for 1 h
at room temperature followed by washing with PBS three times. DIC transmitted-
light images were acquired through a Plan Fluor 10× Ph1 DLL objective with
a DS-Qi2 monochrome camera on a Nikon Eclipse Ti-E inverted microscope.
A Texas Red HYQ filter was used to capture the Alexa 594 signal. The images
obtained were rotated, cropped, and resized with ImageJ to be aligned with the
images obtained in calcium imaging. 

Two-plate thermal preference tests. To eliminate as far as possible any extraneous
genetic influences the Trpm2−/− mice were backcrossed onto the parental
C57BL/6J line for 7 generations. To minimize environmental effects, wild-type
and Trpm2−/− littermates from heterozygous matings were compared in
behavioural experiments. Sample size to achieve significance was determined from trial
experiments but no power analysis was performed. All mice were tested at all
temperatures so no randomization of experimental groups was necessary (see Fig. 3
legend). We used a two-plate thermal preference test (Bioseb, France) with one
plate maintained at a temperature of 33°C, which other studies have shown is the
preferred temperature, and the other at a variable temperature. The temperatures
of test and control plates were reversed after 30 min to control for any influence of
environmental cues. Other studies have observed sex differences in mouse thermal
behaviour, so we followed other authors in using only adult males (10–16
weeks old) in behavioural experiments.

Two hot/cold-plate machines (Bioseb, France), placed back to back, formed the 
two-plate thermal apparatus. Plates were enclosed in a plexiglass chamber divided 
into two lanes, with an opaque compartment between them, and two mice were 
tested simultaneously in adjacent lanes (see Supplementary Information Video 2). 
The temperature of each plate was controlled by T2CT software (Bioseb, France). 
Plate temperatures were tested with an infrared thermometer (Bioseb) and were 
found to be accurately controlled to within 0.2°C of the command temperature 
over the entire plate area. One plate was maintained at the preferred temperature 
of 33°C, and mice were initially placed onto the plate with starting temperature 
other than 33°C ('plate A'; see Fig. 3) before initiating recording. The movements of
the mice between the two plates were recorded for 3,495 s without human presence
and the mouse position was determined with an automated video tracking system
(Bioseb), so operator blinding was not necessary. The temperatures of the two
plates were exchanged 1,800 s after initiation of recording; plate temperature settled
to within 0.2°C of the new temperature within 180 s of the change. Experiments
were performed between 8 a.m. and 10 p.m., with room temperature at 20°C. For
experiments testing thermal preference between the two mildest temperatures,
28°C versus 33°C and 33°C versus 38°C, mice were tested again, with the starting
temperatures of the two plates exchanged, 3–5 h after the first recording. For other
temperatures recordings were made only once on a particular mouse. Mice were
tested with the protocols, in order, of 28°C versus 33°C, 33°C versus 38°C, 23°C
versus 33°C and 33°C versus 43°C. Sample size was based on pilot experiments.
When making statistical comparisons, variances were checked to ensure that they were
similar between groups being compared. All animal experiments were approved by
the Animal Welfare and Ethical Review Body (AWERB), King’s College London. 
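
One plausible way to summarise a recording from the protocol above is the fraction of time a mouse spends on whichever plate is currently at 33°C. The Python sketch below is illustrative only: the data layout (a list of 'A'/'B' plate labels sampled at a fixed interval), the function name and the optional `settle_s` exclusion window are all assumptions, and the authors' actual tracking output and analysis may differ. Only the 1,800 s exchange time and the 3,495 s recording length are taken from the text.

```python
SWAP_S = 1800   # plate temperatures exchanged at this time (from the text)
TOTAL_S = 3495  # recording duration in seconds (from the text)


def fraction_on_33C(samples, dt, settle_s=0.0):
    """Fraction of counted samples spent on the plate that is at 33 degrees C.

    samples  : plate labels ('A' or 'B'), one every dt seconds from t = 0
               (hypothetical layout; plate A starts at the test temperature)
    settle_s : optional window after the exchange to exclude while the plates
               settle (an assumption; the text reports settling within 180 s)
    """
    on_33 = counted = 0
    for i, plate in enumerate(samples):
        t = i * dt
        if t >= TOTAL_S:
            break
        if SWAP_S <= t < SWAP_S + settle_s:
            continue  # skip the settling window if one was requested
        # Before the exchange plate B is at 33 degrees C; afterwards plate A is.
        plate_at_33 = "B" if t < SWAP_S else "A"
        counted += 1
        on_33 += plate == plate_at_33
    return on_33 / counted if counted else float("nan")
```

A mouse that follows the 33°C plate across the exchange scores 1.0, whereas a mouse that never leaves plate B scores 1800/3495 with no settling window.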

Supplementary Information Video 2 shows an example of a thermal-choice 
behavioural experiment. 

Statistical analysis. All data are expressed as means ± s.e.m. Analyses were
performed with GraphPad Prism version 6.01 or SigmaPlot 11.0. The particular 
statistical test used is stated either in the text or figure legends. 
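
As an illustration of the most frequently used test in the figure legends, the following self-contained Python sketch implements a two-sided Fisher's exact test for a 2×2 table with exact integer arithmetic. It is not the GraphPad Prism or SigmaPlot implementation used by the authors; the function name is hypothetical, and the two-sided rule (summing all tables no more probable than the observed one) is one common convention that is assumed here.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)  # number of tables with these margins

    def weight(x):
        # Unnormalised hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x)

    observed = weight(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    numer = sum(w for x in range(lo, hi + 1) if (w := weight(x)) <= observed)
    return numer / denom


# Counts from Extended Data Fig. 1: 41/500 wild-type versus 1/282 Trpm2-/- neurons.
p = fisher_exact_two_sided(41, 500 - 41, 1, 282 - 1)
print(f"P = {p:.2e}")
```

With these counts the computed P value falls below 0.0001, in line with the value reported in the legend.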

Biological and technical replicates. Biological replicates are stated in the legends 
for each figure. Given the nature of these experiments, technical replicates were 
not possible. 
Extended Data Figure 1 | Effect of altering the order of agonist
application in DRG neurons. a, Method for detecting novel heat-sensitive
somatosensory neurons. Representative traces showing increases of
[Ca2+]i (F340/380 ratio, ordinate) in DRG neurons in response to the
TRPV1-3 agonist 2-APB (250 µM), to the specific TRPV4 agonist
PF-4674114 (V4 agonist, 200 nM), to the specific TRPV1 agonist capsaicin
(1 µM), to the TRPM3 agonist pregnenolone sulphate (PS, 100 µM), to a
heat ramp from 35°C to 46°C (temperature protocol shown at bottom),
and to KCl (50 mM). Other details as in Fig. 1. From top: TRPV1-expressing
neuron responding to 2-APB, capsaicin (1 µM) and heat
(red, 30% of 500 neurons); TRPM3-expressing neuron responding to
PS (blue, 100 µM) and heat (18%); TRPV1- and TRPM3-co-expressing
neuron responding to 2-APB, PS, capsaicin, and heat (brown, 10%);
neuron unresponsive to TRP channel agonists but showing a response to
heat and therefore expressing a novel heat-sensitive ion channel (black,
8% of total). No neuron responded to the specific TRPV4 agonist
PF-4674114 (200 nM). b, Heat has a small effect on the fura-2 fluorescence
ratio, so we eliminated neurons in which an increase of fluorescence
ratio was due simply to this physical effect by comparing the increase of
fura-2 fluorescence ratio in neurons with that in glial cells in the same
culture. Maximum increases in F340/380 ratio in response to a heat ramp
from 35°C to 46°C in wild-type glial cells (black bars, top, n = 60) and in
wild-type DRG neurons not responding to known thermo-TRP agonists
(black bars, bottom, from same images as the glial cells in top panel,
n = 139). Vertical black dashed line in top panel shows mean + 3.09 s.d.
(cumulative probability value of 99.9%) of the increase in the F340/380
ratio in glial cells; this value is taken as the maximum increase in F340/380
ratio caused by effect of heat on fura-2 and is used as the cut-off value for
defining novel heat-sensitive neurons present in the same culture dish
(vertical black dashed line in lower panel). Similar results from separate
culture of Trpm2−/− glia (n = 40) and neurons (n = 76) shown in red. The
proportion of novel heat-sensitive neurons was significantly reduced from
8% (41/500) in wild type to 0.4% (1/282) in Trpm2−/− (P < 0.0001; Fisher's
exact test). The increases in F340/380 ratio of novel heat-sensitive neurons
above the cut-off value in response to heat (smallest increase = 0.019854)
are all higher than that of the single heat-responding Trpm2−/− neuron
(0.019084). c, Pie charts showing the percentage of novel heat-sensitive
neurons responding to TRP ion channel agonists and to heat in wild-type
DRG neurons (left) and DRG neurons from Trpm2−/− mice (right).
Deletion of Trpm2 reduces the percentage of novel heat-sensitive neurons
from 8% to 0.4%. Cell numbers for a–c were 500 DRG neurons from
one wild-type mouse on 3 coverslips and 282 DRG neurons from one
Trpm2−/− mouse on 2 coverslips imaged. No further replicates of this
particular experimental protocol were performed.
Extended Data Figure 2 | Effect of starting the heat ramp at a lower
temperature in DRG neurons. Identical experiment to that shown in
Fig. 1a–c, except that the temperature ramp started from 30°C. a, Agonist
and heat-responsive neurons as in Fig. 1a. From top: TRPV1-expressing
neuron responding to capsaicin (1 µM) and heat (red, 28% of 491
neurons); TRPM3-expressing neuron responding to PS (blue, 100 µM) and
heat (31%); TRPV1- and TRPM3-co-expressing neuron responding to
capsaicin, PS and heat (brown, 13%); neuron unresponsive to TRP channel
agonists but showing a response to heat and therefore expressing a novel
heat-sensitive ion channel (black, 7% of total). A small number of neurons
(8%) responded to 2-APB (250 µM) but not to other agonists, and 14%
of DRG neurons did not respond to any of the agonists nor to heat (not
shown). No neuron responded to the specific TRPV4 agonist PF-4674114
(200 nM). b, Maximum increases in F340/380 ratio in response to a heat
ramp from 30°C to 46°C in wild-type glial cells (black bars, top, n = 60)
and in heat-sensitive wild-type DRG neurons not responding to known
thermo-TRP agonists (black bars, bottom, from same images as the glial
cells in top panel, n = 103). Vertical black dashed line in top panel shows
mean + 3.09 s.d. (cumulative probability value of 99.9%) of the increase in
the F340/380 ratio in glial cells; this value is taken as the maximum increase
in F340/380 ratio caused by effect of heat on fura-2 and is used as the cut-off
value for defining novel heat-sensitive neurons present in the same culture
dish (vertical black dashed line in lower panel). Similar results from
separate culture of Trpm2−/− glia (n = 60) and neurons (n = 73) shown
in red. The proportion of novel heat-sensitive neurons was significantly
reduced from 7% (103/491) in wild type, to 2% (73/522) in Trpm2−/−
(P < 0.0001; Fisher’s exact test). The mean increase in F340/380 ratio of novel
heat-sensitive neurons above the cut-off values in response to heat was
also significantly reduced from 1.237 ± 0.09207 in wild type (n = 36) to
0.7959 ± 0.03767 in Trpm2−/− (n = 8) (P = 0.0313; two-tailed unpaired
t-test). c, Pie charts showing the percentage of novel heat-sensitive and
TRPV1- or TRPM3-expressing neurons in wild-type DRG neurons (left)
and DRG neurons from Trpm2−/− mice (right). Cell numbers for imaging
in a–c were 491 DRG neurons from one wild-type mouse on 3 coverslips
and 522 DRG neurons from one Trpm2−/− mouse on 3 coverslips. No
further replicates of this particular experimental protocol were performed.
Extended Data Figure 3 | Diameters of novel heat-sensitive DRG
neurons compared to neurons responding to other TRP agonists.
a, Diameters of 1,324 DRG neurons taken from experiment illustrated
in Fig. 1a (dotted line). Subpopulations of neurons are shown as follows:
those responding to capsaicin and thus expressing TRPV1 (red); to PS and
thus expressing TRPM3 (blue); to both agonists and thus co-expressing
TRPV1 and TRPM3 (brown); to 2-APB alone (green); novel heat-sensitive
neurons (orange), and neurons responding neither to heat nor to any
of these agonists (black). b, Diameter comparison of subpopulations of
neurons. TRPV1-expressing neurons have the smallest mean diameter
(18.58 ± 0.17 µm), TRPM3-expressing neurons are intermediate
(21.75 ± 0.33 µm), and neurons expressing only the novel heat-sensitivity
have the largest mean diameter (25.47 ± 0.48 µm). Significance from
one-way ANOVA and multiple comparisons with Tukey’s multiple
comparison test (****P < 0.0001; NS, not significant). Data obtained from
Fig. 1a–c; 1,324 DRG neurons from one wild-type mouse on 4 coverslips
were analysed.
Extended Data Figure 4 | The novel heat-sensitivity in DRG neurons
is partially co-expressed with TRPV1 and TRPM3, and is enhanced
by H2O2. a, Temperature ramp to 47°C, as in Fig. 1a, but with TRPV1
blocked with AMG9810 (5 µM) and TRPM3 blocked with naringenin
(10 µM). Criterion level for significant increase (dashed lines) taken
from glial cells in same field of view (data not shown). Black bars:
46% of all wild-type DRG neurons (n = 580) responded to heat ramp
from 34°C to 46°C with an increase in [Ca2+]i above criterion level in
presence of blockers of TRPV1 and TRPM3 (dashed vertical line), while
the percentage decreased to 17% in Trpm2−/− DRG neurons (red bars,
n = 1,007) (P < 0.0001; Fisher’s exact test). The mean increase in F340/380
ratio above the cut-off values (dashed lines) in response to heat was also
significantly reduced from 1.619 ± 0.06133 in wild type (n = 265) to
1.027 ± 0.08394 in Trpm2−/− (n = 175) (P < 0.0001; two-tailed unpaired
t-test). In similar experiments with TRPV1 blocker BCTC (4 µM) and
naringenin (10 µM), 37% of wild-type neurons responded to heat (data
not shown, n = 554). b, Similar plot as in a, but data from the subgroup
of novel heat-sensitive neurons not responding to agonists for known
thermo-TRP channels. After exposure to heat in presence of blockers of
TRPV1 and TRPM3, blockers were removed and neurons not responding
to known TRP agonists were identified as in Fig. 1a. Data from same
experiment as shown in a. Proportion of neurons expressing the novel
heat-sensitive mechanism in isolation (that is, without co-expression
of TRPV1 or TRPM3) was significantly lower (52/580, 9%) than all
neurons expressing the novel heat-sensitive mechanism (46%, see a).
The proportion of novel heat-sensitive neurons was significantly reduced
in Trpm2−/− mice, from 9% to 0.6% (6/1,007, P < 0.0001; Fisher’s exact
test). c, Temperature ramp to 42°C. Few novel heat-sensitive neurons
respond to this low temperature in either wild type or Trpm2−/−. The total
number of neurons was n = 173 for wild type and n = 211 for Trpm2−/−.
d, Responses of the same neurons to the same temperature ramp to
42°C following addition of H2O2 (400 µM). Enhancement of response
in wild type (black bars) was largely abolished in Trpm2−/− (red bars).
The proportion of novel heat-sensitive neurons after sensitization with
H2O2 was significantly reduced from 11% (74/635) in wild type to 8%
(48/601) in Trpm2−/− (P = 0.0356; Fisher’s exact test). The mean increase
in F340/380 ratio of novel heat-sensitive neurons above the cut-off values
in response to heat was also significantly reduced from 1.175 ± 0.1516 in
wild type (n = 72) to 0.4485 ± 0.04329 in Trpm2−/− (n = 48) (P = 0.0002;
two-tailed unpaired t-test). Cell numbers and replicates for a and b were
580 DRG neurons from one wild-type mouse on 5 coverslips and 1,007
DRG neurons from one Trpm2−/− mouse on 5 coverslips imaged for
the protocol with AMG9810. 554 neurons from one wild-type mouse
on 4 coverslips were imaged for the protocol with BCTC. No further
replicates were carried out. The cell numbers and replicates for c and d
were 635 DRG neurons from one wild-type mouse on 3 coverslips and
601 DRG neurons from one knockout mouse on 3 coverslips imaged. The
experiment was replicated with similar results on 4 additional coverslips
from one mouse.
Extended Data Figure 5 | Most novel heat-sensitive DRG neurons are
IB4-positive. a, Increases in [Ca2+]i (F340/380 ratio image, intensity-coded
in red, in mouse DRG neurons) in response to heat ramp to 46°C (TRPV1
blocked with AMG9810, 5 µM, and TRPM3 blocked with naringenin, 10 µM),
superimposed on differential interference contrast (DIC) transmission
image obtained post-fixation. b, The same field following fixation and
labelling with fluorescent IB4 (green). c, Superimposed calcium and IB4
images from a and b. Black arrows show neurons responding to heat and
positive for IB4. White arrow indicates a neuron responding to heat and
negative for IB4. Black arrowhead shows neuron insensitive to heat and
positive for IB4. Scale bars, 50 µm. d, Diameter histogram of 743 fixed DRG
neurons subgrouped according to novel heat-sensitivity (yellow, red) and
IB4 binding (yellow, green). 25% (184/743) of DRG neurons showed novel
heat-sensitivity, and 74% of these novel heat-sensitive neurons were
IB4-positive, whereas only 53% of heat-insensitive neurons were IB4-positive.
The percentage of IB4-positive neurons is significantly higher in the
heat-sensitive group than in the heat-insensitive group (P < 0.0001; Fisher's
exact test). The diameters shown in d are not directly comparable with the
live cell diameters shown in Extended Data Fig. 3 because of a shrinkage
artefact on fixation. Cell numbers were 743 DRG neurons from one
wild-type mouse on 4 coverslips imaged. No further replicates were performed.
Extended Data Figure 6 | Properties of novel heat-sensitive ion channel
expressed in autonomic neurons. a, Representative traces showing
increases of [Ca2+]i (F340/380 ratio, ordinate) in sympathetic neurons from
superior cervical ganglion (SCG) in response to a mixture of capsaicin
(TRPV1 agonist, 1 µM) and pregnenolone sulphate (TRPM3 agonist,
100 µM); a mixture of 2-APB (TRPV1-3 agonist, 250 µM) and
PF-4674114 (TRPV4 agonist, 200 nM); heat to 47°C (temperature protocol
shown at bottom); and KCl (50 mM). Similar results were obtained with
parasympathetic neurons from pterygopalatine ganglion (PPG, data not
shown). Trace is same as shown in Fig. 2a. b, Similar histograms as in
Extended Data Figs 1b and 2b, but for SCG glial cells and neurons from
wild-type mice (black bars) and Trpm2−/− mice (red bars). 58% of
wild-type SCG neurons (n = 436) showed novel heat-sensitivity with increases
in F340/380 ratio above the criterion level obtained from glial cells (n = 80)
in same culture (black vertical dashed line). In similar experiments on
PPG neurons (n = 484), 47% showed novel heat-sensitivity (not shown).
Red bars and red dashed line show results from SCG glia (n = 80) and
neurons (n = 430) from Trpm2−/− mice. The proportion of novel
heat-sensitive neurons was significantly reduced by deletion of Trpm2, from
58% (252/436) in wild type to 12% (53/430) in Trpm2−/− (P < 0.0001;
Fisher's exact test). The mean increase in F340/380 ratio of heat-sensitive
neurons above the cut-off values in response to heat was also significantly
reduced, from 1.629 ± 0.1928 in wild type (n = 252) to 0.5050 ± 0.1270
in Trpm2−/− (n = 53) (P = 0.0086; two-tailed unpaired t-test). c, Heat-evoked
Ca2+ increase in SCG neurons is reduced but not abolished by
removal of extracellular Na+ (replaced with choline) and is abolished by
removal of extracellular Ca2+ (remaining small increase in F340/380 ratio
is due to temperature sensitivity of fura-2, see b). Similar results were
seen in 133 SCG neurons. d, Heat-evoked Ca2+ increase in SCG neurons
is blocked by TRPV agonist 2-APB (25 µM). Similar results were seen in
130 SCG neurons. e, The Ca2+ increase in PPG neurons is not affected by
TRPV channel blocker ruthenium red (50 µM). Similar results were seen
in 75 PPG neurons. f, The Ca2+ increase in PPG neurons is not affected
by the Na channel blocker tetrodotoxin (2 µM). Similar results were seen
in 35 PPG neurons. g, The Ca2+ influx in PPG neurons is reduced but
not eliminated by the L-type Ca2+ channel blocker verapamil (100 µM).
Similar results were seen in 30 PPG neurons. Cell numbers and replicates
for a were 166 SCG neurons from three wild-type mice on 3 coverslips
imaged. Cell numbers and replicates for b were 436 SCG neurons from two
wild-type mice on 4 coverslips and 430 SCG neurons from two Trpm2−/−
mice on 4 coverslips imaged. Similar results as those shown for wild-type
obtained with 15 further coverslips of SCG neurons from 9 wild-type mice
and 7 coverslips of PPG neurons from 6 wild-type mice. Cell numbers
and replicates for c were 133 SCG neurons from 3 wild-type mice on
5 coverslips that showed similar responses. Cell numbers and replicates
for d were 130 SCG neurons from 3 wild-type mice on 2 coverslips that
showed similar responses. Similar results were also obtained for DRG
neurons (4 coverslips from 1 mouse). Cell numbers and replicates for
e were 75 PPG neurons from 3 wild-type mice on 4 coverslips that showed
similar responses. Cell numbers and replicates for f were 35 PPG neurons
from 3 wild-type mice on 4 coverslips that showed similar responses. Cell
numbers and replicates for g were 30 PPG neurons from 3 wild-type mice
on 2 coverslips that showed similar responses.
Extended Data Figure 7 | The heat-induced membrane current in
autonomic neurons is not gated by membrane voltage. Current-voltage
difference relations of a PPG neuron with a voltage ramp starting from a
negative potential (inset) show a similar linear heat-induced current to
that obtained with reverse voltage ramp (see Fig. 2c). Trace obtained by
subtracting current-voltage relations at 36°C from that at 47°C. Similar
results were obtained in 3 cells on 3 coverslips.
Extended Data Figure 8 | Responses to heat in adrenal-derived MAH
and PC12 cell lines, and effects of factors causing differentiation to
neuronal phenotype. a, MAH cells. Black indicates maximum increase
in F340/380 ratio in response to heat (47°C, n = 254) when cultured in
dexamethasone (5 µM). No cell responded to TRP agonists, but 27% of
cells responded to heat with increase above mean criterion level obtained
from glial cells in neuronal cultures (see Fig. 1b). Red indicates similar
histogram after 12 days of culture in growth factors (bFGF, CNTF and
NGF, see Methods, n = 170). No cell responded to TRP agonists; 9%
responded to heat. The proportion of heat-sensitive cells was significantly
reduced from 27% (69/254) in dexamethasone to 9% (16/170) in growth
factors (P < 0.0001; Fisher’s exact test). The 66% reduction in the
proportion of heat-sensitive cells was not significantly different from the
reduction in Trpm2 expression caused by differentiation of MAH cells
(Table 1; P = 0.056; two-tailed unpaired t-test). The mean increases in
F340/380 ratio above the cut-off values (dashed lines) in response to heat
were 1.755 ± 0.1255 in dexamethasone (n = 69) and 1.420 ± 0.1474 in
presence of growth factors (n = 16) (P = 0.2203; two-tailed unpaired
t-test). b, PC12 cells. Black indicates culture in growth medium
(10% horse serum plus 5% fetal bovine serum, n = 200). 93% of cells
responded to heat with increase above mean criterion level obtained
from glial cells in neuronal cultures (see Fig. 1b). Red indicates effect
on heat responses of 12 days of culture in NGF (1% horse serum plus
100 ng ml−1 NGF, n = 108). The proportion of heat-sensitive cells
was significantly reduced from 93% (186/200) in growth medium to
46% (50/108) in NGF (P < 0.0001; Fisher’s exact test). We note that a
significantly lower expression of mRNA for TRPM2 in differentiated PC12
cells has been reported. The mean increase in F340/380 ratio above the
cut-off values (dashed lines) in response to heat was significantly reduced
from 3.753 ± 0.2431 in growth medium (n = 186) to 2.603 ± 0.3104 in
NGF (n = 50) (P = 0.0213; two-tailed unpaired t-test). c, Temperature
thresholds of PC12 cells cultured in growth medium. Top left, temperature
protocol. Bottom left, temperature responses of three representative cells.
Right, temperature thresholds calculated as in Fig. 1d. Cell numbers and
replicates for a and b were 2 coverslips for each condition imaged. Cell
numbers are given above. Replicates of 4 coverslips (MAH cells) and
3 coverslips (PC12 cells) for each condition gave similar results. Cell
numbers and replicates for c were 165 PC12 cells on one coverslip imaged.
Extended Data Figure 9 | Effect of deletion of TRPM2 on maximal
calcium responses to heat in neurons expressing TRPV1 or TRPM3.
a, Maximum increases in F340/380 ratio in response to a heat ramp
from 34°C to 46°C in neurons responding only to capsaicin
(TRPV1-expressing) from wild-type (black) and Trpm2−/− (red) mice. The
increase in F340/380 ratio in response to heat (above the increase caused
by the effect of temperature on fura-2, vertical dotted lines, for method
of calculation see Fig. 1b) is not significantly different between wild type
and Trpm2−/− (P = 0.1168, two-tailed Mann-Whitney U-test). Details
as in Fig. 1. b, Neurons responding only to pregnenolone sulphate (PS,
TRPM3-expressing) from the same experiments as in a. The increases
in F340/380 ratio in response to heat are significantly reduced by deletion
of Trpm2 (from 6.389 ± 1.225 to 4.411 ± 1.582, P < 0.0001, two-tailed
Mann-Whitney U-test). c, Neurons responding to both capsaicin and
PS (TRPV1- and TRPM3-expressing). The increases in F340/380 ratio in
response to heat are not significantly different between wild type and
Trpm2−/− (P = 0.0633; two-tailed Mann-Whitney U-test). Cell numbers
and replicates for a–c were 1,324 DRG neurons from one wild-type mouse
on 4 coverslips and 981 DRG neurons from one Trpm2−/− mouse on 4 coverslips
imaged. Similar results obtained from 42 additional coverslips from
6 wild-type mice and 8 additional coverslips from one Trpm2−/− mouse.
Extended Data Figure 10 | Correlation between novel heat sensitivity
and expression of mRNA for TRPM2. a, Representative DIC transmission
image of DRG neurons following in situ hybridization with the sense probe
as negative control. Non-specific density was linearly dependent on cell
diameter (see e). Mean + 3.09 s.d. (cumulative probability value of 99.9%)
of density as a function of diameter with sense probe (5 µm bins) was
used as threshold criterion for significant expression of Trpm2 in images
of antisense hybridization. A similar analysis of non-specific density was
carried out for glial cells. b, Representative DIC transmission image with
antisense probe against Trpm2. Using the threshold criterion as function
of diameter obtained from sense probe images (see a), 89% (1,121/1,250)
of DRG neurons but 3% (4/120) of glial cells were positive for TRPM2
mRNA. c, Novel heat-sensitive DRG neurons determined using calcium
imaging. Increases in [Ca2+]i (F340/380 ratio image, intensity-coded in red)
in response to a heat ramp to 46 °C with TRPV1 blocked with AMG9810
(5 µM) and TRPM3 blocked with naringenin (10 µM). d, Superimposed
image of novel heat-sensitive neurons (red) and in situ hybridization using
antisense probe. Solid red arrows indicate novel heat-sensitive neurons
also positive for TRPM2; solid black arrow shows neuron not responding
to heat but positive for TRPM2; open black arrow shows cell negative for
TRPM2 (that is, with density below the criterion level obtained from e).
42% (92/218) of DRG neurons positive for TRPM2 exhibited novel
heat-sensitivity. However, only 13% (2/16) of DRG neurons negative for
TRPM2 from in situ hybridization exhibited novel heat-sensitivity. The
percentage of novel heat-sensitive DRG neurons is significantly reduced
in TRPM2-negative DRG neurons (P = 0.0188; Fisher's exact test). Scale
bars, 50 µm. e, Density of non-specific label in neurons obtained from
hybridization with sense probe (see a) depends on cell size. Data used
to calculate significance thresholds for neurons in b. Cell numbers and
replicates: for a, one coverslip exposed to sense probe was used as negative
control; for b, all neurons on 5 coverslips were measured and one coverslip
was measured for TRPM2-positive glial cells; for c and d, one coverslip was
analysed as in a and b for the combined calcium imaging and in situ
hybridization protocol; for e, 500 DRG neurons on one coverslip exposed to
the sense probe were used to determine the background threshold as a
function of cell diameter. Similar in situ hybridization results were
obtained on 16 additional coverslips.
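
The 3.09 s.d. criterion corresponds to the 99.9th percentile of a normal distribution, and a per-bin threshold of this kind can be sketched as follows. This is an illustrative Python sketch, not the authors' analysis code: the function names, the (diameter, density) input format and the example numbers are hypothetical, and the population s.d. is used as a simplifying assumption.

```python
from collections import defaultdict
from statistics import NormalDist, mean, pstdev

# z-score whose cumulative normal probability is 99.9% (about 3.09).
Z_999 = NormalDist().inv_cdf(0.999)


def bin_thresholds(sense_cells, bin_width=5.0, z=Z_999):
    """Return {bin_index: mean + z * s.d.} of sense-probe density per diameter bin.

    sense_cells is an iterable of (diameter_um, density) pairs measured on the
    negative-control (sense probe) coverslip.
    """
    bins = defaultdict(list)
    for diameter, density in sense_cells:
        bins[int(diameter // bin_width)].append(density)
    return {b: mean(v) + z * pstdev(v) for b, v in bins.items()}


def is_positive(diameter, density, thresholds, bin_width=5.0):
    """A cell counts as positive if its density exceeds its bin's threshold."""
    return density > thresholds[int(diameter // bin_width)]


# Hypothetical sense-probe measurements (diameter in um, arbitrary density units).
sense = [(21.0, 10.0), (22.5, 12.0), (24.0, 14.0), (26.0, 15.0), (28.0, 17.0)]
thresholds = bin_thresholds(sense)
print(round(Z_999, 2))
```

Binning by diameter reflects the caption's statement that non-specific density depends on cell size, so a single global cut-off would misclassify large or small cells.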
The prion protein is an agonistic ligand of the
G protein-coupled receptor Adgrg6


Alexander Küffer, Asvin K. K. Lakkaraju, Amit Mogha, Sarah C. Petersen, Kristina Airich, Cédric Doucerain,
Rajlakshmi Marpakwar, Pamela Bakirci, Assunta Senatore, Arnaud Monnard, Carmen Schiavi, Mario Nuvolone,
Bianka Grosshans, Simone Hornemann, Frederic Bassilana, Kelly R. Monk & Adriano Aguzzi


Ablation of the cellular prion protein PrPC leads to a chronic
demyelinating polyneuropathy affecting Schwann cells. Neuron-
restricted expression of PrPC prevents the disease (ref. 1), suggesting that
PrPC acts in trans through an unidentified Schwann cell receptor.
Here we show that the cAMP concentration in sciatic nerves from
PrPC-deficient mice is reduced, suggesting that PrPC acts via a
G protein-coupled receptor (GPCR). The amino-terminal flexible
tail (residues 23-120) of PrPC triggered a concentration-dependent
increase in cAMP in primary Schwann cells, in the Schwann cell line
SW10, and in HEK293T cells overexpressing the GPCR Adgrg6 (also
known as Gpr126). By contrast, naive HEK293T cells and HEK293T
cells expressing several other GPCRs did not react to the flexible tail,
and ablation of Gpr126 from SW10 cells abolished the flexible tail-
induced cAMP response. The flexible tail contains a polycationic
cluster (KKRPKPG) similar to the GPRGKPG motif of the
Gpr126 agonist type-IV collagen (ref. 2). A KKRPKPG-containing PrPC-
derived peptide (FT23-50) sufficed to induce a Gpr126-dependent
cAMP response in cells and mice, and improved myelination in
hypomorphic gpr126 mutant zebrafish (Danio rerio). Substitution
of the cationic residues with alanines abolished the biological
activity of both FT23-50 and the equivalent type-IV collagen peptide.
We conclude that PrPC promotes myelin homeostasis through
flexible tail-mediated Gpr126 agonism. As well as clarifying the
physiological role of PrPC, these observations are relevant to the
pathogenesis of demyelinating polyneuropathies, common
debilitating diseases for which there are limited therapeutic options.

Neuronal ablation of Prnp triggers chronic demyelinating polyneu-
ropathy (CDP) (ref. 1), suggesting that Schwann cells bear a PrPC receptor.
We therefore assessed the binding of full-length PrPC (recPrP, residues
23-231), the flexible tail (FT; residues 23-110), or its refolded globular
domain (GD; residues 121-231), to primary Schwann cell (PSC)
cultures from PrnpZH1/ZH1 sciatic nerves (ref. 3) using POM1 and POM2
antibodies against epitopes of PrP (ref. 4). Both recPrP and FT, but
not GD, were found to stain PSCs (Extended Data Fig. 1a).

Using transcription activator-like effector nucleases, we generated
a Prnp-ablated subclone (termed SW10ΔPrP) of the SW10 Schwann
cell line (Extended Data Fig. 1b, c). Again, recPrP and FT, but not
GD, adhered to SW10ΔPrP cells (Fig. 1a and Extended Data Fig. 1d, e).
Neither recPrP nor FT adhered to the Prnp−/− hippocampal cell line
HpL? (Extended Data Fig. 1f), suggesting that binding was specific to
Schwann cells. We next measured the binding of synthetic PrP-derived
peptides (2 µM, 20 min) to SW10ΔPrP cells (Fig. 1b). FT23-50, but neither
FT39_66 nor any of the C-proximal peptides (Extended Data Table 1),
showed binding to SW10ΔPrP cells (Fig. 1c), suggesting that residues
23-38 are essential for this interaction.

We treated SW10 cells with trypsin (2.5% w/v, 10 min) to degrade
membrane-resident proteins. Following trypsin inactivation, we added
non-trypsinized SW10 cells labelled with the cell-tracking dye Deep
Red. Haemagglutinin (HA)-tagged FT23-50 (2 µM) was added to the
cells, and binding was monitored with anti-HA antibodies. FT23-50
bound to 51% of the non-trypsinized cells and 5% of the trypsinized
cells (Extended Data Fig. 1g), suggesting that there is a surface FT
receptor on Schwann cells. Cytofluorimetry revealed no altered
binding of HA-tagged FT23-50 to SW10 cells treated with phosphoinositol
phospholipase C (PI-PLC, 30 min at 37 °C), indicating that binding
did not require glycophosphatidylinositol (GPI)-anchored surface
proteins, whereas binding of POM2 was greatly reduced, indicating
that PrPC had been stripped from cell surfaces (Extended Data Fig. 1h).

Might the FT alter cAMP signalling in Schwann cells? Sciatic
nerves taken from PrnpZH1/ZH1 and wild-type mice at 4 days of age
showed similar cAMP levels (Extended Data Fig. 2a). Sciatic nerves
Figure 1 | Schwann cells selectively bind the FT of PrPC. a, Prnp-ablated
SW10 cells (SW10ΔPrP) were exposed to recombinant PrPC (recPrP),
FT, or GD (ligand). PrPC and FT, but not GD, adhered to Schwann cells
(red: POM1 and POM2 antibodies; grey: DAPI). p75NTR antibodies
identified Schwann cells. Scale bars, 26 µm. b, FT-derived peptides and
PrPC domains. CC1 and CC2, charge clusters 1 and 2; OR, octapeptide
repeats. Peptides are colour-coded as in c. c, SW10ΔPrP cells were exposed
to FT-derived peptides (2 µM, 20 min) carrying a C-terminal HA tag. Flow
cytometry showed strong binding by peptide FT23-50. d, Sciatic nerves
from PrnpZH1/ZH1 mice displayed lower cAMP than those from wild-type
BL6 mice. Dots represent individual mice (11-15 mice per group). Error
bars show s.e.m. Unpaired Student's t-test was used for statistical analysis.
Data (a, c) are representative of three biological replicates.
Figure 2 | The FT fragment elicits a concentration-dependent cAMP
response. a, Primary PrnpZH1/ZH1 Schwann cells were treated (20 min)
with increasing concentrations of recombinant FT or with 10 µM GD.
O, untreated cells. cAMP levels were determined in cell lysates
(5 x 10° cells per assay). Addition of FT, but not of GD, induced a
concentration-dependent cAMP response in Schwann cells. NS, not
significant; **P < 0.01; ***P < 0.001. b, cAMP concentrations in primary
PrnpZH1/ZH1 Schwann cell cultures exposed to medium conditioned by
HEK293T cells overexpressing wild-type murine PrPC (HEKPrP, right)
or a non-coding vector (HEKempty, left) as control. FT-containing medium
resulted in cAMP induction. c, Synthetic peptides (27-44 residues) were
added to SW10ΔPrP cells (2 µM, 20 min). Only FT23-50 induced cAMP.
d, FT23-50 was preincubated with a twofold molar excess of miniantibodies
Fab3 or Fab71, and added to SW10ΔPrP cells (20 min). Preincubation
with Fab3, but not with Fab71, significantly quenched the FT-dependent
cAMP spike. Error bars show s.e.m. Panels depict independent triplicates;
unpaired Student's t-test was used for statistical analysis.
taken from 4-week-old PrnpZH1/ZH1 mice showed a trend towards
decreased cAMP levels (Extended Data Fig. 2b). When taken from
PrnpZH1/ZH1 mice at 12-16 weeks, when CDP is incipient’, sciatic
nerves exhibited significantly lower cAMP levels than those taken from
wild-type mice (Fig. 1d; P = 0.0115). Sciatic nerve lysates from strictly
isogenic C57BL/6J PrnpZH3/ZH3 mice® (10-16 weeks old) also displayed
significantly lower levels of cAMP than did those from wild-type mice
(Extended Data Fig. 2c).

Sciatic nerves from 12-16-week-old tgNSE-PrP mice, which express
PrPC in neurons, showed cAMP levels similar to those of wild-type
mice, whereas sciatic nerves from tgMBP-PrP mice, which express PrPC
in Schwann cells and suffer from CDP, showed lower cAMP levels
(Extended Data Fig. 2d). Moreover, when treated with FT (0.1-5 µM,
20 min), PrnpZH1/ZH1 PSC and SW10 cells displayed a concentration-
dependent increase in cAMP (Fig. 2a and Extended Data Fig. 2e) with
an EC50 of 860 nM (Extended Data Fig. 2f).
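
A concentration-response relationship of this kind is commonly summarized with a Hill-type curve, in which the EC50 is the concentration that produces a half-maximal response. The Python sketch below only illustrates that definition using the reported value of 860 nM; the Hill coefficient of 1, the normalized 0-1 response scale and the function name `hill_response` are assumptions, not parameters reported in the text.

```python
def hill_response(conc_nm, ec50_nm=860.0, hill=1.0, bottom=0.0, top=1.0):
    """Fractional response at a ligand concentration (nM) under a Hill model."""
    if conc_nm <= 0:
        return bottom
    return bottom + (top - bottom) / (1.0 + (ec50_nm / conc_nm) ** hill)


# Concentrations spanning the 0.1-5 uM range used for FT treatment.
for conc in (100, 860, 5000):
    print(conc, round(hill_response(conc), 3))
```

Under these assumptions the response is exactly half-maximal at 860 nM and approaches saturation towards the upper end of the tested range.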

We transfected HEK293T cells, which express little endogenous
PrPC, with a plasmid expressing murine PrPC or with an empty plasmid
(HEKPrP or HEKempty, respectively). Immunoprecipitation showed
that HEKPrP cells released soluble FT (approximately 37 ng FT per
ml; Extended Data Fig. 3a, b). Exposure to conditioned medium from
HEKPrP cells, but not from HEKempty cells, raised cAMP levels in PSC
cultures generated from PrnpZH1/ZH1 mice (Fig. 2b). Spent medium
from PSC cultures and sciatic nerves from 10-week-old C57BL/6 mice
did not contain any FT, suggesting that little FT is released by Schwann
cells (Extended Data Fig. 3c, d).

To find out whether the FT contains motifs responsible for inducing
the cAMP response, we treated SW10ΔPrP cells with the same synthetic
peptides used previously to assess binding to Schwann cells. Peptide
FT23-50 induced a cAMP response, whereas FT 34-9 and all C-proximal
peptides were inactive (Fig. 2c). These results suggest that residues
23-33, which contain the lysine-rich charge cluster 1 (CC1), represent
the biologically active region of the FT. To test this prediction, we
incubated FT23-50 with monovalent recombinant phage-derived
miniantibodies recognizing the CC1 (Fab3) or octapeptide repeats
(Fab71) of PrP. Preincubation with Fab3, but not with Fab71,
significantly quenched the ability of FT23-50 to induce an increase in
cAMP in SW10ΔPrP cells (Fig. 2d).

Peripheral nervous system myelination is controlled by Gpr126, an
adhesion GPCR expressed by Schwann cells”*. We therefore generated
HEK293T cells stably overexpressing human Gpr126 or, as controls,
Gpr124 or Gpr176 (denoted HEKGpr126, HEKGpr124 and HEKGpr176,
respectively) with a C-terminal V5 epitope tag’ (Extended Data Fig. 3e).
Flow cytometry revealed that FT23-50 bound to HEKGpr126 cells but not
to HEKGpr124 or HEKGpr176 cells (Extended Data Fig. 3f, g). Moreover,
exposure to FT23-50 (20 min, >500 nM) increased cAMP in HEKGpr126
cells but not in HEKGpr124 or HEKGpr176 cells (Fig. 3a). We then treated
HEK293T, HEKGpr124 and HEKGpr126 cells with recombinant FT or GD
(20 min, 2 µM). FT was detected only in lysates from HEKGpr126 cells
immunoprecipitated with anti-V5 antibodies, and GD was not detected
in lysates from any of the cells (Extended Data Fig. 4a).

Next, we generated clonal SW10 cells devoid of endogenous Gpr126
(designated SW10ΔGpr126). When treated with FT23-50, SW10 (but not
SW10ΔGpr126) cells reacted with increased cAMP levels. A modified
FT23-50 peptide (with lysine residues replaced with alanines) was
ineffective in binding cells and inducing cAMP (Fig. 3b). We then
treated SW10ΔGpr126 cells transfected with human Gpr126, Gpr124,
Gpr176 or Gpr56 with FT23-50. Only Gpr126-transfected cells showed
a cAMP response (Extended Data Fig. 4b) similar to that of naive SW10
Figure 3 | FT-dependent cAMP signalling in Gpr126-ablated Schwann
cells. a, Intracellular cAMP in wild-type HEK293T cells and in Gpr176-,
Gpr124- and Gpr126-overexpressing cells exposed to FT23-50 (0.5 µM,
20 min). Only HEKGpr126 cells showed an increase in cAMP. b, Wild-type
(left) and Gpr126-ablated (right) SW10 cells were exposed to FT23-50
(2 µM, 20 min). SW10 cells, but not SW10ΔGpr126 cells, responded to
FT23-50 with a cAMP spike. Moreover, SW10 cells did not respond to
alanine-substituted FT23-50 (FT23-50(K→A)). c, Protein was isolated from
wild-type or PrnpZH3/ZH3 sciatic nerves (13-week-old female mice) and
western blots were probed for Egr2 and actin. Densitometry (below)
showed reduced Egr2 in PrnpZH3/ZH3 nerves (P = 0.028). For uncropped
gels see Supplementary Information File 1. Error bars show s.e.m. Panels
depict independent triplicates; unpaired Student's t-test was used for
analysis. **P < 0.01.
Figure 4 | FT and collagen-IV share a cAMP-inducing domain.
a, Sequence alignment revealed two regions of similarity between the
FT and Col4 (red boxes). Yellow and green shades represent high and
moderate similarity, respectively. Dotted line, non-homologous residues;
asterisks, identical residues. b, SW10ΔPrP cells were treated with synthetic
FT23-50 or modified versions of FT23-50 in which the KKRPK or QGSPG
motifs were replaced with alanines (2 µM, 20 min). Alanine substitution of
KKRPK (peptide FT23-50(K→A)), but not of QGSPG, abrogated cAMP induction.
c, SW10ΔPrP and SW10ΔGpr126 cells were exposed (2 µM, 20 min) to the
synthetic peptides FT23-50, FT23-34 or sFT23-34 (peptide containing
scrambled amino acid sequence of FT23-34). FT23-50 and FT23-34 induced
cAMP in SW10ΔPrP cells but not SW10ΔGpr126 cells. Error bars show s.e.m.
Panels depict independent triplicates; unpaired Student's t-test was used
for analysis. **P < 0.01; ***P < 0.001.


cells, indicating that the tag did not affect the function of Gpr126
(Extended Data Fig. 4c). When treated with conditioned medium from
HEKPrP or HEKempty cells, SW10 but not SW10ΔGpr126 cells responded
with a cAMP spike (Extended Data Fig. 4d). Moreover, FT adsorption
was reduced in SW10ΔGpr126 cells (Extended Data Fig. 4e).

We then administered FT23-50 (20 min, 2 µM) to HEKGpr126 cells and
HEK293(H) cells (a clonal variant of HEK293T cells with superior
growth and transfection efficiencies) transfected with plasmids
encoding human Gpr56, Gpr64, Gpr133, or Gpr97. Only Gpr126-
expressing cells showed a cAMP response (Extended Data Fig. 4f). The
magnitude of the cAMP response was not enhanced by increasing the
amount of transfected plasmid, suggesting that other signalling
components became limiting (Extended Data Fig. 5a). There was no
cAMP induction in PrnpZH1/ZH1 cerebellar granule neuronal cultures
treated with FT23-50 or FT23-50(K→A) (Extended Data Fig. 5b), as expected
from the minimal Gpr126 expression in the brain!”. The FT is released
from PrPC by metalloproteases (ref. 11). After treatment with the
metalloprotease inhibitor TAPI-2, HEKPrP-conditioned medium
contained significantly less FT than untreated medium (Extended Data
Fig. 5c, d) and displayed reduced cAMP-inducing activity (Extended
Data Fig. 5c).

Egr2 (also known as Krox-20) controls the expression of myelin genes
and has been implicated in myelin maintenance’. Egr2 expression
was decreased in sciatic nerves from 13-week-old PrnpZH3/ZH3 mice
(P < 0.05; Fig. 3c), and recombinant FT (2 µM, 24 h) activated Egr2-
dependent luciferase expression in SW10 cells (Extended Data Fig. 5e).
Similarly, Egr2 transcription was upregulated in primary Schwann cells
treated with recombinant FT (2 µM, 1 h) (Extended Data Fig. 5f). Also,
Akt phosphorylation increased 5 min after treatment with recombinant
FT (2 µM) and peaked at 10 min in SW10ΔPrP but not SW10ΔGpr126
cells (Extended Data Fig. 5g). The integrity of SW10 cells and their
subclones was confirmed by the expression of myelin genes (Extended
Data Fig. 6a).

We identified two regions of similarity between FT (KKRPKPG and
QGSPG) and the Gpr126 ligand, type-IV collagen (Col4) (ref. 2) (GPRGKPG
and QGSPG; Fig. 4a). Replacement of the conserved cationic residues
with alanines (KKRPKPG → AAAPAPG), but not other substitutions,
abrogated cAMP induction in SW10ΔPrP cells (Fig. 4b); treatment with
FT23-34 (2 µM, 20 min), which contains KKRPKPG, sufficed to induce
cAMP in SW10ΔPrP but not in SW10ΔGpr126 cells (Fig. 4c). We next
generated murine PrPC mutants containing alanine substitutions in
either of the two conserved motifs. After transient transfection, both
mutants were highly expressed by HEK293T cells (Extended Data
Fig. 6b), and cleaved FT was recovered in the medium (Extended Data
Fig. 6c). When applied to SW10ΔPrP cells, conditioned medium from
HEK293T cells expressing wild-type or QGSPG-mutated PrPC
(HEKS*Gcspc) induced cAMP, whereas medium from cells expressing
KKRPK-mutated PrPC (HEKSxrex) did not (Extended Data Fig. 6d).
We then generated 21-mer peptides bearing the corresponding Col4
sequence (GPRGKPG) or an alanine-substituted variant (AAAGAAG).
The native Col4 peptide (8 µM), but not the mutated peptide, induced
cAMP in SW10ΔPrP cells (Extended Data Fig. 6e).
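
The alanine substitution of the FT motif and its comparison with the Col4 motif can be illustrated in a few lines of Python. This is a sketch for clarity only: the helper names are hypothetical, and the rule of replacing every lysine and arginine with alanine is inferred from the KKRPKPG to AAAPAPG example in the text (it is not applied here to the Col4 variant, which follows a different substitution pattern).

```python
CATIONIC = {"K", "R"}  # lysine and arginine


def alanine_substitute(motif, residues=CATIONIC):
    """Replace each residue in `residues` with alanine (A)."""
    return "".join("A" if aa in residues else aa for aa in motif)


def identical_positions(seq_a, seq_b):
    """Return the 1-based positions at which two equal-length motifs match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("motifs must have equal length")
    return [i for i, (x, y) in enumerate(zip(seq_a, seq_b), start=1) if x == y]


ft_motif = "KKRPKPG"    # PrP flexible-tail motif
col4_motif = "GPRGKPG"  # type-IV collagen motif

print(alanine_substitute(ft_motif))
print(identical_positions(ft_motif, col4_motif))
```

A position-by-position comparison like this is only a crude stand-in for the sequence alignment shown in Fig. 4a, which also scores similar (non-identical) residues.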

In zebrafish, Gpr126 controls myelination; however, Prnp−/− mice
have no obvious myelination defects but develop a late-onset peripheral
neuropathy with demyelination, onion bulb formation (Fig. 5a and
Extended Data Fig. 7a), decreased Remak bundles containing 10
or more axons and increased bundles containing fewer than 10 axons
(Extended Data Fig. 7b), indicative of impaired axon-Schwann cell
interactions. To test whether Gpr126 dysfunction could cause late-onset
phenotypes in mice, we examined sciatic nerves from DhhCre::Gpr126fl/fl
mice, in which Gpr126 is specifically deleted in Schwann cells from
approximately embryonic day (E)12.5 onwards!*!*. At one year of
age, these mice showed neuropathic traits similar to those of Prnp−/−
mice including reduced bundles containing >20 unmyelinated axons,
increased bundles containing <10 axons, abnormal cytoplasmic
Schwann cell protrusions’ and onion bulbs (Fig. 5b and Extended
Data Fig. 7c, d).

The gpr126st63 zebrafish mutant has a point mutation that reduces
Gpr126 signalling and shows decreased myelin basic protein (Mbp)
expression by Schwann cells of the posterior lateral line nerve (PLLn)
(Fig. 5c and Extended Data Fig. 8a), which can be rescued by Gpr126
activators®'>!6. When applied to gpr126st63 zebrafish larvae at 50-55 h
post-fertilization (hpf), FT23-50 (20 µM) increased Mbp expression in
the PLLn at 5 days post-fertilization (dpf) compared to DMSO-treated
larvae (Fig. 5c, d; P < 0.05). We also treated gpr126st49 larvae, which
encode a truncated Gpr126 incapable of Gs signalling®!°. Mbp was
barely detectable in both FT23-50- and DMSO-treated mutant gpr126st49
larvae (Fig. 5c and Extended Data Fig. 8b).

We next administered intravenous FT23-50 or FT23-50(K→A) peptide (600 µg)
to 10-16-week-old PrnpZH3/ZH3 and wild-type mice (n = 8 per group,
all littermates). FT23-50, but not FT23-50(K→A), induced cAMP elevation in
both PrnpZH3/ZH3 and PrnpZH1/ZH1 sciatic nerves 20 min after the
injection; cAMP reached levels similar to those of BL6 mice injected
with FT23-50(K→A) (Fig. 5e and Extended Data Fig. 8c). FT23-50 also induced
a cAMP spike in the heart, which expresses Gpr126 (Fig. 5f and
Extended Data Fig. 8d) but not in the kidney or brain (Extended Data
Fig. 8e, f), which do not express Gpr126 (ref. 17). Finally, FT23-50 but
not FT23-50(K→A) (600 µg) induced a robust cAMP spike in sciatic nerves
from Gpr126fl/fl but not DhhCre::Gpr126fl/fl mice (Fig. 5g).

Here we report, to our knowledge, the first molecular elucidation
of a phenotype caused by PrPC deficiency. While Gpr126 is crucial
for peripheral nerve myelination during development, the late-onset
phenotype of Prnp-ablated mice suggests that it has additional roles in
myelin maintenance. Gpr126−/− mice exhibit drastic hypomyelination,
but the Prnp−/− phenotype is relatively mild and late-onset, akin to
peripheral neuropathies, whose prevalence is 2.4% generally and 8%
among the aged!*. Perhaps Gpr126 provides basal signalling in Prnp−/−
mice through its interaction with type-IV collagen and/or laminin-211
(refs 2, 16), which may not suffice for long-term myelin maintenance.
The Gpr126-agonistic properties of systemically administered FT
Figure 5 | Myelinotrophic effect of FT in zebrafish and mice.
a, Transmission electron micrographs of sciatic nerves from 14-month-old
wild-type BL6 and PrnpZH3/ZH3 mice (n = 3-4). Thinly myelinated axons
(black arrow), loss of axon-Schwann cell interaction (boxes), abnormal
cytoplasmic Schwann cell protrusions (white arrowhead) and initial onion
bulb formation (asterisk) were observed in PrnpZH3/ZH3 mice. Scale bars,
500 nm. b, Neuropathic phenotype of nerves from one-year-old
DhhCre::Gpr126fl/fl mice. Left, toluidine blue-stained sections of sciatic nerves
from control Gpr126fl/fl (phenotypically wild-type) and DhhCre::Gpr126fl/fl
(Gpr126ΔSchwann) mice. Gpr126fl/fl nerves were well-myelinated (n = 3/3 mice),
whereas Gpr126ΔSchwann nerves exhibited myelin loss with readily apparent
onion bulb-like structures (arrows; n = 3/3 mice). Right panels: Transmission
electron micrographs of sciatic nerves from Gpr126fl/fl and Gpr126ΔSchwann
mice. Myelinated axons (M) and Remak bundles (R) were found in Gpr126fl/fl
mouse sciatic nerves (n = 3/3 mice). Numerous defects were observed in
Gpr126ΔSchwann mouse sciatic nerves (n = 3/3 mice) including onion bulbs
(black arrow), abnormal cytoplasmic protrusions (white arrowheads), and
loss of axon-Schwann cell interactions (boxes) similar to those seen in
PrnpZH3/ZH3 mice. Scale bars: 20 µm (left), 2 µm (right). c, gpr126st63
hypomorphic mutant zebrafish larvae were treated with vehicle (DMSO) or
FT23-50 (20 µM) at 50-55 hpf and the posterior lateral line nerve was
immunostained at 5 dpf for myelin basic protein (Mbp, green). AcTub:
acetylated tubulin (red) labelling axons. Scale bar, 20 µm. The intensity of
immunofluorescence was assessed by morphometry (right). FT treatment
enhanced Mbp immunofluorescence without affecting AcTub. d, Mbp
expression was scored in larvae treated with FT23-50 or vehicle (DMSO). FT23-50
treatment resulted in a higher proportion of rescued (some and strong) Mbp
expression in gpr126st63 (53% versus 34%) but not in gpr126st49 larvae
(P < 0.05, Fisher's two-tailed exact test). NS, not significant. n > 25 larvae per
replicate treatment. e, PrnpZH3/ZH3 and BL6 mice were intravenously injected
with either FT23-50 or its non-charged analogue FT23-50(K→A) (600 µg per animal,
20 min). PrnpZH3/ZH3 sciatic nerves showed a significant cAMP increase after
injection with FT23-50, but not FT23-50(K→A). FT23-50-treated PrnpZH3/ZH3 sciatic
nerves reached cAMP levels similar to those of BL6 mice injected with
FT23-50(K→A). Each dot represents an individual animal. f, Heart cAMP levels were
also increased in FT23-50-injected mice. g, Control Gpr126fl/fl (WT) and
DhhCre::Gpr126fl/fl mutant (Gpr126ΔSchwann) mice were intravenously injected
with either FT23-50 or FT23-50(K→A) (600 µg per animal). Sciatic nerves were
isolated 20 min after injection. FT23-50 elicited a significant cAMP increase in
wild-type mice, but not in Gpr126ΔSchwann mice. FT23-50(K→A) injection did not alter
cAMP levels (n = 3). Error bars show s.e.m. Unpaired Student's t-test was used
for analysis. *P < 0.05; **P < 0.01; ***P < 0.001.
peptides suggest that soluble ligands may be beneficial in diseases
caused by hypomorphic Gpr126 mutations!”, and perhaps also in other
hereditary motor-sensory neuropathies.

A moderately conserved homology region between PrPC and Col4
proved essential to the biological activity of both proteins, whereas
laminin-211 might activate Gpr126 through different mechanisms.
Although Gpr126 is not expressed in the central nervous system,
certain mutants of PrPC cause myelin pathology in vivo’. This observation
raises the question of whether inappropriate GPCR activation may play
a role in prion diseases within the CNS.


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS


Mice. Mice were bred in specified-pathogen-free facilities at the University
Hospital Zurich and Washington University, and housed in groups of 3-5, under
a 12 h light/12 h dark cycle (from 7 a.m. to 7 p.m.) at 21 ± 1 °C, with sterilized
chow food (Kliba No. 3431, Provimi Kliba) and water ad libitum. Animal care
and experimental protocols were in accordance with the Swiss Animal Protection
Law, and approved by the Veterinary Office of the Canton of Zurich (permits 123,
130/2008, 41/2012 and 90/2013). The following mice were used in the present
study: C57BL/6J, PrnpZH1/ZH1 (ref. 3), co-isogenic C57BL/6J PrnpZH3/ZH3 and
PrnpWT/WT control mice® and Schwann cell-specific DhhCre::Gpr126fl/fl mutants**.
Mice of both genders were used for experiments unless specified. Archival tissues
from previous studies® were also analysed in the current study.

No statistical methods were used to predetermine sample size. The experiments 
were not randomized and the investigators were not blinded to allocation during 
experiments and outcome assessment except where stated. 

Primary Schwann cell culture. Sciatic nerves from postnatal day 2-5 were
dissected using microsurgical techniques. Nerves were dissociated in serum-free
DMEM supplemented with 0.05% collagenase IV (Worthington) for 1 h in the
incubator. Sciatic nerves were mechanically dissociated using fire-polished Pasteur
pipettes. Cells were filtered in a 40-µm cell strainer and washed in Schwann cell
culture medium (DMEM, Pen-Strep, Glutamax, FBS 10%) by centrifugation at
300g for 10 min. Resuspended cells were plated on 3.5 cm Petri dishes previously
coated with poly-L-lysine 0.01% (w/v) and laminin (1 mg/ml). Laminin (Cat. No:
12020; from Engelbreth-Holm-Swarm murine sarcoma basement membrane) and
poly-L-lysine were obtained from Sigma-Aldrich.

Purification of recombinant proteins. Full-length recombinant PrP (recPrP,
residues 23-231) and globular domain (GD, residues 121-231) were purified as
previously described”!-**. The generation of the GST fusion FT-PrP expression
vector (pGEX-KG FT-PrP) was described previously; a modified purification
protocol was used‘. The FT-PrP expression vector was transformed into BL21
(DE3) strain of Escherichia coli (Invitrogen). Bacteria were grown in Luria-Bertani
medium to an OD of 0.6, and the expression of the fusion protein was induced with
0.5 mM isopropyl-1-thio-β-D-galactopyranoside (AppliChem). Cells were then
grown for another 4 h at 37 °C and 100 rpm shaking. Cells were pelleted at 5,000g
for 20 min at 4 °C (Sorvall centrifuge, DuPont). The pellet was resuspended on ice
in lysis buffer (phosphate-buffered saline supplemented with complete protease
inhibitors (EDTA-free, Roche), phenylmethylsulfonyl fluoride (Sigma) and 150 µM
lysozyme (Sigma)) and incubated on ice for 30 min. Triton-X 100 (1%), MgCl2
(10 mM) and DNase I (5 µg/ml, Roche) were added, and the lysate was incubated
on ice for 30 min. The lysate was then centrifuged for 20 min at 10,000g at 4 °C.
Glutathione sepharose beads were washed with PBS and incubated with the cell
lysate for 1 h at 4 °C on a rotating device. Beads were packed into a column and
washed with PBS until a stable baseline was reached as monitored by absorbance at
A280 using an ÄKTAprime (GE Healthcare). The fusion protein was cleaved on the
beads with 5 U/ml thrombin (GE Healthcare) for 1 h at room temperature under
agitation. For thrombin removal, benzamidine sepharose beads were added and
incubated for 1 h at 4 °C on a rotating wheel. Protein preparations were analysed by
12% NuPAGE gels followed by Coomassie- or silver-staining. To achieve a higher
purity of the protein, we next applied the protein to a sulfopropyl (SP) sepharose
column equilibrated with 50 mM Tris-HCl buffer, pH 8.5. Elution was performed
with a linear NaCl gradient of 0-1,000 mM. Fractions containing the protein were
collected and concentrated (AMICON; MWCO 3500). The protein was then
injected in 500 µl portions into a size-exclusion chromatography system
(TSK-GEL G2000SWXL column (Tosoh Bioscience)) and eluted with a linear gradient
using PBS. Pure fractions were combined, concentrated and stored at −20 °C. The
purity of FT-PrP was >95-98% as judged by a silver-stained 12% NuPAGE gel.
Cell culture. SW10 cells and clones derived from them were all grown in DMEM 
medium supplemented with 10% fetal bovine serum (FBS), penicillin-streptomycin 
and Glutamax (all obtained from Invitrogen). HEK293T cells, its clonal variant 
HEK293(H) cells and clones derived therefrom overexpressing various GPCRs 
were grown in DMEM-F12 medium supplemented with 10% FCS, penicillin- 
streptomycin and Glutamax (all obtained from Invitrogen). All cell lines were 
regularly monitored for mycoplasma contamination. The authenticity of SW10 
and its derivatives was established by monitoring the expression of Schwann-cell 
specific markers (Extended Data Fig. 6a). Human Gpr126 (NM_020455), Gpr124, 
Gpr64, Gpr56, Gpr133, Gpr56 and Gpr176 expression plasmids (pCGpr126-V5, 
pCGpr124-V5, pCGpr65-V5, pCGpr56-V5, pCGpr133-V5, pCGpr56-V5 and 
pCGpr176-V5) were generated by PCR amplification of the respective cDNA 
followed by TOPO cloning into the pCDNA3.1/V5-His-TOPO vector. The cDNA 
was in frame with the V5 tag (sequence: GEPIPNPLLGLDST) at the C terminus. 
HEK@?6 and HEK@?®!6 cells were generated by transfecting 1 µg of plasmid
into one well of a subconfluent 6-well plate using 3 µl Fugene (Roche) according
to the manufacturer’s protocol. Twenty-four hours after transfection, cells were
transferred to a 10-cm dish and grown in selective medium containing 0.4 mg/ml
G418 (Invitrogen) until emergence of resistant colonies. A limiting dilution 
was carried out to obtain clonal lines. Membrane expression of the transgene 
was assessed in the selected clones by confocal microscopy using 1:100 diluted 
anti-V5 antibody (Invitrogen) and the Cytofix/Cytoperm kit (Pharmingen Cat. 
Nr. 554714), according to the manufacturer's protocol. 

Cerebellar granule neuronal culture. Cerebellar granule neurons were generated 
from 7-8-day-old Prnp“"/™ mice as described previously”®. Cultures were plated 
at 350,000 cells per cm² in Basal Medium Eagle (BME) (Invitrogen) with 10% (v/v)
FCS and maintained at 37 °C in 5% CO2.

Plasmids and transfections. pCDNA-PrPC was generated by cloning murine PrPC
into pCDNA3.1 vector as described previously”®. A site-specific mutagenesis kit
(Stratagene) was used to induce alanine substitutions of QGSPG and KKRPK
domains in PrPC. Primers used for generating the Ala-QGSPG plasmid were:
forward, GTG GAA GCC GGT ATC CCG GGG CGG CAG CCG CTG CAG 
GCA ACC GTT ACC CG; reverse, GGG TAA CGG TTG CCT GCA GCG GCT 
GCC GCC CCG GGA TAC CGG CTT CCA C. Primers for Ala-KKRPK were: 
forward, CTA TGT GGA CTG ATG TCG GCC TCT GCG CAG CGG CGC CAG 
CGC CTG GAG GGT GGA ACA CCG; reverse, CGG TGT TCC ACC CTC CAG 
GCG CTG GCG CCG CTG CGC AGA GGC CGA CAT CAG TCC ACA TAG. 
Transfections were performed with Lipofectamine 2000 (Invitrogen) according 
to the manufacturer’s protocol. 3 µg of DNA was used per well of a 6-well plate.
Cells were washed 24h after transfection using PBS, and fresh medium was added 
to the cells. 

Immunoprecipitation. HEK293T and HEKGPR126 cells growing in T75 flasks
at 50% density were treated with recombinant FT or GD (2 µM, 20 min). Cells
were washed twice in PBS and lysed in IP buffer: 1% Triton X-100 in PBS,
1x protease inhibitors (Roche) and PhosSTOP (Roche) for 20 min on ice
followed by centrifugation at 5,000 rpm for 5 min at 4°C. BCA assays were
performed to quantify the amount of protein, and 500 µg of protein was used for
immunoprecipitations. 2 µg anti-V5 antibody was added to the cell lysate and
incubated on a wheel rotator overnight at 4°C. On the following day, Protein G
Dynabeads (Invitrogen) were added to the samples and incubated for a further
3h on the wheel at 4°C. Beads were washed three times for 5 min each using
the IP buffer followed by addition of 2x sample buffer containing DTT (1 mM
final). Samples were heated at 95°C for 5 min, loaded on 4-12% Novex Bis-Tris
gels (Invitrogen), and migrated for 1.5h at 150 V followed by western blotting.
Immunoprecipitations were performed by adding 2 µg of POM2 antibody to 500 µl
of cell medium and incubating overnight on a wheel rotator at 4°C. Protein G beads 
were then added, and incubation on a wheel rotator at 4°C was performed again. 
RNA isolation and quantitative PCR. RNA extraction and quantitative 
PCR were performed as described previously’. The following primers were 
used: EGR2 forward: 5′-AATGGCTTGGGACTGACTTG-3′; EGR2 reverse:
5′-GCCAGAGAAACCTCCATT-3′; GAPDH forward: 5′-CCACCCCAGCAAGGAGAC-3′;
GAPDH reverse: 5′-GAAATTGTGAGGGAGATGCT-3′.
Zebrafish mutant strains and analysis. Adult zebrafish were maintained in the 
Washington University Zebrafish Consortium facility
(http://zebrafishfacility.wustl.edu/) and all experiments were performed in
compliance with institutional
protocols. Embryos were collected from harem matings or in vitro fertilization, 
raised at 28.5 °C, and staged according to standard protocols”’. The gpr126"” and 
gpr126" mutants were described previously”. 

Zebrafish peptide treatment, immunostaining, and quantification. gpr12 
or gpr126"” mutants were collected from homozygous mutant crosses and wild- 
type larvae were collected from AB* strain crosses and raised to 50 hpf. FT23–50
treatment of gpr126 mutants was performed as previously described». Briefly, egg
water was replaced with either 20 µM FT23–50 in egg water or egg water containing
an equivalent volume of DMSO. At 55 hpf, larvae were washed twice and raised
in egg water to 5 dpf. Wild-type and gpr126 larvae were fixed in 2%
paraformaldehyde plus 1% trichloroacetic acid in phosphate buffered saline, and
Mbp and acetylated tubulin immunostaining was performed as described previously*”*.
Expression scoring was performed with observers blinded to treatment according 
to the following rubric: strong, strong and consistent expression throughout PLLn; 
some, weak but consistent expression in PLLn; weak, weak and patchy expression 
in PLLn; none, no expression in PLLn. n= three independent replicate gpr126"* 
assays and one gpr126"“? assay. n= 87 DMSO-treated gpr126"® larvae, 81 Prp- 
FT-treated gpr126" larvae, 27 DMSO-treated gpr126”” larvae, 25 Prp-FT-treated 
gpr126" larvae. 

Fluorescent nerve images were analysed using the Fiji software”. A rectangular 
region-of-interest (ROI) was drawn longitudinally over the fluorescent nerve. 
The longitudinal grey-scale histogram of the myelin basic protein (Mbp) was 
normalized pixel-by-pixel to the corresponding intensity of the acetylated
tubulin (AcTub). The size of the measured ROIs was kept constant across different
treatment modalities.
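
The ratio described above can be illustrated with a minimal Python sketch. It is not the Fiji workflow used in the study: the function names, the list-of-rows image representation and the choice to average each channel across the ROI width before dividing are assumptions made purely for illustration.

```python
def longitudinal_profile(roi):
    """Mean grey value at each position along the nerve.

    `roi` is a list of rows; each row holds the pixel intensities of one
    line of the rectangular ROI, ordered along the nerve axis.
    """
    n_rows = len(roi)
    n_cols = len(roi[0])
    return [sum(row[x] for row in roi) / n_rows for x in range(n_cols)]


def mbp_over_actub(mbp_roi, actub_roi):
    """Mbp profile divided, position by position, by the AcTub profile.

    Both ROIs must have identical dimensions (the ROI size is kept constant).
    Positions with zero AcTub signal yield NaN instead of raising an error.
    """
    same_shape = len(mbp_roi) == len(actub_roi) and all(
        len(m) == len(a) for m, a in zip(mbp_roi, actub_roi)
    )
    if not same_shape:
        raise ValueError("Mbp and AcTub ROIs must have the same dimensions")
    mbp = longitudinal_profile(mbp_roi)
    actub = longitudinal_profile(actub_roi)
    return [m / a if a > 0 else float("nan") for m, a in zip(mbp, actub)]
```

For example, `mbp_over_actub([[2, 4], [4, 8]], [[1, 2], [2, 4]])` gives a flat profile, because the Mbp signal is twice the AcTub signal at both positions.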
Flow cytometry. SW10 cells were grown in T75 flasks at 50% density, rinsed
with PBS, and detached from culture flasks with dissociation buffer containing 
EDTA (GIBCO). After detaching, cells were washed to remove residual EDTA and 
counted using a Neubauer chamber. Batches of 10° SW10 cells were transferred to 
FACS tubes, treated with HA-tagged recombinant peptides for 20 min, washed, and 
incubated with Alexa-488 conjugated anti-HA antibody for 30 min. After further 
washes and centrifugations, cells were resuspended in 200 µl FACS buffer (PBS +
10% FBS) and analysed with a FACS Canto II cytofluorimeter (BD Biosciences).
Data were analysed using FlowJo software.

Western blot analysis. Schwann cells were lysed in cell-lysis buffer (Tris-HCl 
20mM, NaCl 137 mM, Triton-X-100 1%) supplemented with protease inhibitor 
cocktail (Roche complete mini). The lysate was homogenized by passing several 
times through a 26G syringe, and cleared by centrifugation at 8,000g, 4°C for 2 min
in a tabletop centrifuge (Eppendorf 5415R). Protein concentration was measured
with the BCA assay (Thermo Scientific). 10 µg total protein was boiled in 4× LDS
(Invitrogen) at 95°C for 5 min. After a short centrifugation, samples were loaded on
a 4–12% gradient Novex Bis-Tris gel (Invitrogen) for electrophoresis at constant
voltage of 200 V. Gels were transferred to PVDF membranes with the iBlot system
(Life Technologies). Membranes were blocked with 5% Top-Block (Sigma) in
PBS-T for 1h at room temperature. Primary antibody was incubated overnight in 
PBS-T with 5% Top-Block. Membranes were washed three times with PBS-T for 
10 min and incubated for 1h with secondary antibodies coupled to horseradish 
peroxidase at room temperature. After three washes with PBS-T, the membranes 
were developed with a Crescendo chemiluminescence substrate system (Millipore). 
Signals were detected using a Stella 3200 imaging system (Raytest). 

Antibodies. Monoclonal antibodies against PrPC were obtained and used as
described previously*. Fab3 and Fab71 antibodies were generated using the phage
display technology and their epitopes were mapped with overlapping peptides.
Anti-AKT and anti-p-AKT antibodies were obtained from Cell Signaling and used at
1:2,000 dilutions for western blotting. The anti-p75NGF receptor antibody was
obtained from Abcam and used at a 1:200 dilution for immunofluorescence. Anti-V5
antibody was from Invitrogen and used at a dilution of 1:500 for western blot,
and 2 µg antibody was used for immunoprecipitation on 500 µg of cell lysate.

Cyclic AMP measurements. In the direct cAMP ELISA assay, cAMP levels were 
assessed with a colorimetric competitive immunoassay (Enzo Life Sciences). 
Quantitative determination of intracellular cAMP was performed in cells or tissues 
lysed in 0.1 M HCl to stop endogenous phosphodiesterase activity and to stabilize
the released cAMP. SW10 or HEK293T cells (100,000 cells per well) were plated in 
6-well plates to ~50% density. Cells were treated with conditioned medium or
recombinant peptides (2 µM, unless specified) for 20 min unless otherwise mentioned. Cells
were lysed with 0.1 M HCl lysis buffer (Direct cAMP ELISA kit, Enzo). To ensure 
complete detachment of cells, cell scrapers were used. Lysates were homogenized 
with a 26G needle and syringe before clearing by centrifugation at 600g for 10 min. 
The subsequent steps were performed according to the manufacturer's protocol based 
on competition of sample cAMP with a cAMP-alkaline phosphatase conjugate. To 
measure in vivo cAMP changes, BL6, PrnpZH3/ZH3 or PrnpZH1/ZH1 mice were
intravenously injected with 600 µg of either FT23–50 or, as a control, uncharged
FT23–50 (FT, *4)). Twenty minutes after infusion, mice were killed and all organs
were collected. For cAMP assays, organs were homogenized in 0.1 M HCl. 
Subsequent steps were performed according to the manufacturer’s protocols as 
described above. Cyclic AMP levels were calculated using a cAMP standard curve 
in the case of ELISA based assay. Finally, cAMP concentrations were normalized 
to total protein content in each sample. cAMP changes are represented as fold 
changes to the respective controls. For each experiment, at least three independent 
biological replicates were used. For in vivo assays, groups of 8-16 mice were used 
for each experiment. For normalization purposes, the median value of the 
respective control sample was defined as 1. All measurements within each panel 
were normalized to this control value. For in vivo assays, sample sets were coded 
and investigators were blinded to their identities. The assignment of codes to 
sample identities was performed only after the cAMP values were plotted for 
each set. 
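
The normalization steps above (cAMP per unit of total protein, then scaling so that the median of the respective control equals 1) can be sketched in Python as follows. The function name, the input layout and the units are assumptions for illustration only and do not reproduce the analysis actually used in the study.

```python
from statistics import median


def camp_fold_changes(samples, control_group):
    """Express cAMP as fold change relative to the control median.

    `samples` maps a group label to a list of (camp, protein) pairs, where
    `camp` is the concentration read from the standard curve and `protein`
    is the total protein content of the same sample (hypothetical units).
    """
    # Step 1: normalize each cAMP value to the total protein of its sample.
    per_protein = {
        group: [camp / protein for camp, protein in values]
        for group, values in samples.items()
    }
    # Step 2: the median of the control group is defined as 1.
    reference = median(per_protein[control_group])
    # Step 3: scale every measurement in the panel to that control value.
    return {
        group: [value / reference for value in values]
        for group, values in per_protein.items()
    }
```

With three control samples at 10, 20 and 30 units per mg protein and one treated sample at 40 units per mg, the control median (20) becomes 1 and the treated sample is reported as a 2-fold change.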

Generation of Schwann cell lines devoid of Gpr126. We designed two 
CRISPR short-guide RNA (sgRNAs) against exon 2 of Gpr126 (upper Guide 
CCTGTGTTCCTCTCTCAGGT and lower Guide AACAGGAACAGCAGG 
GCGCT). The DNA sequences corresponding to the sgRNAs were cloned into 
expression plasmids and transfected with EGFP-expressing Cas9-nickase plasmids. 
Single EGFP-expressing Schwann cells were isolated with a FACS sorter (Aria III). 
To determine the exact sequence of indels induced by genome editing, we amplified
the sgRNA-targeted locus by PCR and subcloned the fragments into blunt-TOPO
vectors. Ten colonies per cell line were sequenced and showed distinct indels on
each allele. A clonal subline devoid of Gpr126 was used for further studies. This cell
line possessed insertions on both alleles; a 49-bp insertion at position 118 and a
5-bp insertion at position 84 on each allele. Both insertions led to a frameshift and
to the generation of premature stop codons leading to early translation termination.
Promoter luciferase assay. Luciferase reporter constructs were generated
containing a 1.3-kb sequence upstream of the transcription start site of
Egr2. SW10 Schwann cells were transfected with the Egr2 reporter construct and a
Renilla plasmid using Lipofectamine 2000. After one day in vitro, Schwann cells
were treated with recombinant full-length PrP (23-231), the globular domain
of PrP (121-231) or PBS control. Luciferase activity was measured 24h after
stimulation with the Dual-Luciferase Reporter Assay System (Promega) according
to the manufacturer’s recommendations. Results were normalized to Renilla
transfection controls.
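
As a hedged illustration of normalizing to the Renilla transfection control, the Python sketch below divides each firefly reading by the Renilla reading from the same well and then compares treated with control wells. The function names and the fold-induction summary are assumptions added for the example; the text above specifies only the normalization to Renilla.

```python
from statistics import mean


def normalized_activity(firefly, renilla):
    """Firefly signal divided by the Renilla signal of the same well."""
    if renilla <= 0:
        raise ValueError("Renilla signal must be positive")
    return firefly / renilla


def fold_induction(treated, control):
    """Mean normalized activity of treated wells over that of control wells.

    `treated` and `control` are lists of (firefly, renilla) readings,
    one pair per well (hypothetical luminometer units).
    """
    treated_mean = mean([normalized_activity(f, r) for f, r in treated])
    control_mean = mean([normalized_activity(f, r) for f, r in control])
    return treated_mean / control_mean
```

Dividing by the Renilla signal first corrects each well for differences in transfection efficiency before any comparison between treatments is made.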

Immunocytochemistry. Glass coverslips were placed in 12-well plates (Thermo 
Scientific) and coated with 0.01% w/v Poly-L-lysine solution (Sigma) overnight 
at room temperature. Coverslips were washed three times with ddH2O and dried
for 2h in a laminar-flow hood. Schwann cells were seeded and cultured at 50% 
density. Cells were treated with recombinant FT-PrP, full length recPrP or C1-PrP 
for 20 min, and washed with serum-free DMEM. Cells were further washed with 
PBS followed by fixation with 4% paraformaldehyde. Fixed cells were incubated in 
blocking buffer (PBS+10% FBS) for 1h. Cells were treated with various primary 
antibodies followed by washes and incubation with Alexa 488 and Alexa 647 tagged 
rabbit or mouse secondary antibodies (Life Technologies). Imaging was performed
with a Leica SP2 confocal microscope using a 20× objective; images were processed
with ImageJ software.

Transmission electron microscopy. Transmission electron microscopy was 
performed as previously described®. Briefly, mice under deep anaesthesia were 
subjected to transcardial perfusion with PBS heparin and sciatic nerves were fixed 
in situ with 2.5% glutaraldehyde plus 2% paraformaldehyde in 0.1 M phosphate 
buffer, pH 7.4 and embedded in Epon. Ultrathin sections were mounted on copper 
grids coated with Formvar membrane and contrasted with uranyl acetate/lead 
citrate. Micrographs were acquired using a Hitachi H-7650 electron microscope 
(Hitachi High-Tech, Japan) operating at 80kV. Brightness and contrast were 
adjusted using Photoshop. For quantification of Remak bundles and onion bulb- 
like structures, images were captured at 1,500× magnification and axon numbers
and abnormal onion bulb-like structures were counted manually. Quantification 
was performed in a blinded fashion by assigning numbers to the images and upon 
completion of quantification genotypes were revealed. 

Recombinant peptides. HA-tagged and untagged synthetic peptides were 
produced by EZ Biosciences. A stock solution of 2 mM was prepared by dissolving
the peptides in PBS and they were used at a final concentration of 2 µM unless
specified. The sequences of all the peptides used in this study can be found in 
Extended Data Table 1. 
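
Going from the 2 mM stock to the 2 µM working concentration is a 1:1,000 dilution. The short Python sketch below works through that arithmetic (C1 × V1 = C2 × V2); the function name and the 2 ml culture volume in the usage note are hypothetical and are not taken from the protocol.

```python
def stock_volume_ul(stock_um, final_um, final_volume_ul):
    """Volume of stock (µl) needed to reach `final_um` in `final_volume_ul`.

    Concentrations are given in µM, so a 2 mM stock is entered as 2000.
    """
    if stock_um <= 0 or final_um < 0 or final_volume_ul < 0:
        raise ValueError("concentrations and volumes must be non-negative")
    if final_um > stock_um:
        raise ValueError("final concentration cannot exceed the stock")
    # C1 * V1 = C2 * V2, solved for V1.
    return final_um * final_volume_ul / stock_um
```

For an assumed culture volume of 2 ml (2,000 µl), `stock_volume_ul(2000.0, 2.0, 2000.0)` gives 2 µl of stock.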

Extended Data Figure 1 | FT binds to Schwann cells through a proteinaceous
receptor. a, Primary Schwann cells were isolated from the sciatic nerves of
PrnpZH1/ZH1 mice and grown on coverslips. Cells were exposed for 20 min to
recombinant PrPC, FT, or GD (2 µM), fixed, and stained with POM2 (FT, PrPC) or
POM1 (GD). Antibodies were visualized in the green channel and nuclei were
stained with DAPI (blue). PrPC and FT, but not GD, adhered to the cells. Scale
bars, 25 µm. b, Schematic representation of the target region for transcription
activator-like endonucleases (TALEN) in the Prnp gene. Target guides are
indicated by arrows. Gene editing resulted in a deletion leading to a frame
shift in the PrPC coding sequence (designated as a conflict in the figure) and a
premature stop codon identified by sequencing. c, Wild-type SW10 cells and a
subclone isolated after treatment with TALEN (SW10ΔPrP) were probed by western
blotting using POM1. SW10ΔPrP showed complete abrogation of PrP expression and
was used for further experiments. Levels of actin on the same membrane were
monitored to confirm equal loading of cell lysates onto the gel. For uncropped
gels see Supplementary Information File 1. d, e, SW10ΔPrP cells were treated
with full-length recombinant (PrPC, residues 23-231), flexible tail (FT,
23-110), or globular domain (GD, 121-231). PrP epitopes were detected with POM2
(d) or POM1 (e, red). Grey, DAPI. As expected, FT was detected only by POM2.
Cells were also labelled with antibodies to the p75 nerve-growth factor
receptor (yellow), a Schwann cell marker. PrPC and FT, but not GD, adhered to
Schwann cells. Scale bar, 26 µm. f, The PrPC-deficient cell line HpL? was
treated with recombinant PrPC, FT, and GD as in a. None of the recombinant
proteins adhered to HpL cells. Scale bars, 20 µm. g, SW10 cells were
trypsinized, washed, and mixed with non-trypsinized SW10 cells labelled with
Deep Red cell tracker. Cells were incubated with HA-tagged peptide FT23–50, and
binding was visualized by flow cytometry. The Deep Red signal (abscissa) was
used to differentiate trypsinized from non-trypsinized cells. 51% of untreated
cells, but only 5% of trypsinized cells, became decorated by FT23–50-HA,
indicating that FT23–50 reacted with trypsin-sensitive surface molecules. h,
SW10 cells were digested (30 min) with phosphatidylinositol phospholipase C
(PI-PLC, 0.5 U), washed, and incubated with FT23–50-HA along with undigested
Deep Red-labelled cells (left). The proportion of binders in the digested (34%)
and undigested samples (30%) was similar, indicating that the FT23–50 receptor
was neither PrPC itself nor any other GPI-linked protein. To monitor the
efficiency of PI-PLC treatment, we assessed POM2 binding to PrP on both treated
and untreated cells (right). POM2 binding was significantly decreased in PI-PLC
treated cells (23%) compared to untreated cells (90%). Panels depict
biologically independent triplicates; unpaired Student's t-test was used for
statistical analysis.

Extended Data Figure 2 | Prnp ablation results in lower cAMP levels in sciatic
nerves. a, b, cAMP was measured in sciatic nerves isolated from 4-day-old (a) or
4-week-old (b) BL6 and PrnpZH1/ZH1 mice. No difference was observed in cAMP
levels in 4-day-old mice, whereas 4-week-old PrnpZH1/ZH1 mice displayed a trend
towards decreased cAMP levels. c, Sciatic nerves from 10-week-old Prnp“"/"8
mice showed a significant decrease in cAMP (P=0.0151). d, PrPC expression by
neurons (tgNSE-PrP), but not by Schwann cells (tgMBP-PrP), restored cAMP levels
in sciatic nerves of 10-16-week-old mice. Each dot represents one mouse. e, SW10
cells were seeded in 6-well plates and treated (20 min) with recombinant FT or
GD (10 µM). Addition of FT, but not of GD, resulted in a concentration-dependent
intracellular cAMP increase. f, Primary PrnpZH1/ZH1 Schwann cells were treated
with FT (20 min), and cAMP concentrations were determined by immunoassay. A
dose-response curve was interpolated. Data are representative of three
biologically independent experiments and statistical significance was evaluated
by unpaired Student’s t-tests. Error bars show s.e.m.

Extended Data Figure 3 | FT proteolytically released from full-length PrPC binds
to Gpr126. a, HEK293T cells were transfected with an empty vector (HEKempty) or
a plasmid expressing murine PrPC (HEKPrP). Cell medium was collected 48h after
transfection and subjected to immunoprecipitation with monoclonal antibody POM2
(against PrPC), followed by western blotting using biotinylated POM2 and
streptavidin-HRP. FT was observed only in the medium from HEKPrP cells. b, FT
released into the conditioned medium of HEKPrP was immunoprecipitated using POM2
and visualized by western blotting with biotinylated POM2. Various amounts of
recombinant FT (3.125-100 ng) were used for calibration, and the concentration
of FT released into 1 ml of the medium upon immunoprecipitation was estimated to
be 37 ng ml−1. c, Conditioned medium from primary BL6 Schwann cell cultures
(PSCBL6) was subjected to immunoprecipitation with antibody POM2 followed by
western blotting with POM2. For control, we used conditioned medium from HEK
cells transfected with a non-coding plasmid (HEKempty) or with a plasmid
encoding murine PrPC (HEKPrP). FT was detected only in conditioned medium from
HEKPrP cells (lane 2) but not in conditioned medium from two independent PSCBL6
cultures (lanes 3 and 4). Asterisks denote immunoglobulins detected by the
secondary antibody. d, Sciatic nerve lysates were obtained from 10-week-old
Prnp74/2™ and BL6 mice and subjected to immunoprecipitation with POM2 antibody
followed by western blotting with POM2. Full-length PrPC, but no FT, was
detectable in the immunoprecipitates of nerves from BL6 mice. e, Wild-type
HEK293T cells (HEKWT) or HEK293T cells overexpressing various GPCRs bearing V5
epitope tags (HEKGPR126, HEKGPR124 and HEKGPR176) were grown on coverslips and
stained with anti-V5 antibody (detecting tagged GPCRs; magenta). Nuclei were
stained with DAPI (blue). Staining revealed cell surface expression of all
transfected GPCRs. Scale bar, 8 µm. f, HA-tagged FT23–50 peptide (2 µM) was
added to HEKWT cells or to HEKGPR126 cells, labelled with anti-HA antibody, and
subjected to cytofluorimetry. Overexpression of Gpr126 increased the binding of
FT23–50. g, Binding of HA-tagged FT23–50 to HEKGPR126 cells (right, monitored by
cytofluorimetry) was conspicuously increased over that of wild-type, Gpr176, and
Gpr124-overexpressing HEK293T cells. Data are representative of three
biologically independent experiments; statistical significance was evaluated by
unpaired Student's t-test. Error bars show s.e.m.

Extended Data Figure 4 | FT binds selectively to Gpr126 and induces cAMP. a,
HEK293T, HEKGPR124 and HEKGPR126 cells were exposed (20 min) to recombinant FT,
GD (2 µM), or PBS, and subjected to immunoprecipitation using the anti-V5
antibody, followed by western blotting using POM2, anti-V5 or POM1. Anti-V5
detected full-length Gpr126, Gpr124 (denoted as GprV5 for both proteins) and the
respective C-terminal fragments (Gpr126V5-CTF, Gpr124V5-CTF). POM2 revealed a
band corresponding to the FT (lane 3) that co-precipitated with GPR126. POM1
indicated that GD did not bind. Lanes 1, 2 and 3: HEKGPR126 cells treated with
PBS, GD and FT, respectively. Lanes 4, 5 and 6: HEKGPR124 cells treated with
PBS, GD and FT, respectively. Lanes 7, 8 and 9: HEK293T cells treated with PBS,
GD and FT, respectively. Asterisks: immunoglobulin heavy and light chains. For
uncropped gels see Supplementary Information File 1. b, SW10ΔGpr126 cells plated
at a density of 100,000 cells per well in 6-well plates were transfected with
control plasmid (pCDNA3) or plasmids encoding various GPCRs (Gpr126, 124, 176,
and 56) bearing C-terminal V5 tags. Only cells transfected with pCGpr126-V5
showed a cAMP response to FT23–50 48 h post transfection. PBS treatment was used
for control. c, Intracellular cAMP responses to FT treatment (2 µM, 20 min) in
SW10 and SW10ΔGpr126 cells, as well as SW10ΔGpr126 cells expressing V5-tagged
human Gpr126 (pCGpr126-V5). A significant increase in cAMP was observed in SW10
cells, whereas SW10ΔGpr126 showed no change. In contrast, SW10ΔGpr126 cells
expressing pCGpr126-V5 showed a significant cAMP increase, indicative of
successful complementation. d, SW10 and SW10ΔGpr126 cells were incubated
(20 min) with conditioned medium from HEKempty or HEKPrP cells. HEKPrP-conditioned
medium induced a robust cAMP spike in SW10 but not SW10ΔGpr126 cells. e, SW10
and SW10ΔGpr126 cells were grown on coverslips for 24h and exposed to
recombinant FT (2 µM, 20 min). Cells were stained with POM2 (detecting FT, red;
DAPI-stained nuclei, grey) and antibodies to p75NGFR (yellow). Deletion of
Gpr126 largely suppressed FT binding. Scale bar, 26 µm. f, HEK293(H) cell lines
were transfected with plasmids expressing different adhesion GPCRs (Gpr: 97,
133, 64, 56), followed by selection of cells expressing the receptor in presence
of geneticin. GPCR-expressing cells and HEKGPR126 cells were then treated with
either FT23–50 or FT5 4) for control (FT and C, respectively). Only cells
expressing Gpr126 responded to FT23–50 with a cAMP spike. Interestingly, cells
expressing Gpr133 reacted with a decrease in cAMP levels. Data are
representative of three biologically independent experiments; statistical
significance was evaluated by unpaired Student’s t-test. Error bars show s.e.m.

Extended Data Figure 5 | FT promotes signal transduction in a Gpr126-dependent
manner. a, HEK293T cells were transfected with increasing amounts of human
Gpr126 plasmid (2-5 µg per well of a 6-well plate). 48h post transfection, cells
were treated with FT23–50 or PBS as a control. Increasing amounts of Gpr126 cDNA
did not result in amplification of the cAMP signal. b, Primary Prnp7"/“™
cerebellar granule neuron cultures were seeded in 6-well plates at a density of
5 x 10° cells per well and treated with FT23–50, FT35_4y, or PBS. No alterations
in the levels of cAMP were noticed. c, SW10 cells were exposed to conditioned
medium from HEK cells that had been transfected with empty vector (HEKempty) or
a PrPC expression vector (HEKPrP). HEKPrP cells were optionally treated with
100 µM of the TAPI-2 protease inhibitor for 24h before harvesting the medium.
TAPI-2 treatment resulted in reduced cAMP induction, suggesting that impaired
proteolytic cleavage of the FT from PrPC resulted in decreased signalling. d,
Quantification of FT released into the medium relative to the total amount of
PrPC in lysates by western blotting. The spent medium of HEKPrP cells treated
with TAPI-2 contained less FT. e, SW10 and SW10ΔGpr126 cells were transfected
with an Egr2-controlled firefly luciferase reporter and treated with recombinant
FT (2 µM) or PBS (24h). Ordinate: luciferase expression normalized to a Renilla
luciferase control (n= 3; *P < 0.05; t-test). Luciferase activity was observed
only in SW10 cells stimulated with FT but not in SW10ΔGpr126 cells. f, Primary
Schwann cells were exposed to recombinant FT (2 µM, 1h) or PBS. Egr2 mRNA
expression was measured by quantitative RT-PCR and normalized against a panel of
housekeeping genes. For uncropped gels, see Supplementary Information File 1. g,
SW10ΔPrP and SW10ΔGpr126 cells were grown in 6-well plates, exposed to
recombinant FT (<30 min), and analysed by western blotting (left). Densitometry
(right) showed increased phospho-AKT/AKT ratio in SW10ΔPrP cells, but not in
SW10ΔGpr126 cells. Data are representative of three biologically independent
experiments; statistical significance was evaluated by unpaired Student's
t-test. Error bars show s.e.m.

Extended Data Figure 6 | A domain conserved between FT and collagen-IV is
required for cAMP induction. a, SW10, SW10ΔPrP and SW10ΔGpr126 cells were grown
on coverslips and stained with antibodies against myelin-associated glycoprotein
(MAG), myelin oligodendrocyte glycoprotein (MOG), glial fibrillary acidic
protein (GFAP) and p75 nerve growth factor receptor (p75NGFR) (left, all green;
DAPI-stained nuclei, blue). Cells labelled with secondary antibody alone (2° Ab)
were used as controls to determine unspecific staining. Scale bars, 10 µm.
Expression in all cell lines was confirmed by western blotting (right). Lysate
from HEK293T wild-type cells (HEKWT) was used as control. All proteins except
MBP were expressed in SW10 cells and its derivatives. For uncropped gels, see
Supplementary Information File 2. b, Western blot (developed with POM2) of
HEK293T cells transfected with expression plasmids for wild-type murine PrPC or
for PrPC bearing lysine-to-alanine substitutions in the KKRPK and QGSPG motifs
(lanes 3 and 4, respectively). The mutations did not affect the biogenesis and
processing of PrPC. c, Western blot of medium collected from the cells shown in
b. FT fragments bearing the mutations were released into the medium similarly to
wild-type FT. For uncropped gels, see Supplementary Information File 1. d,
SW10ΔPrP cells were treated with conditioned medium from HEK293T cells
transfected with an empty vector (HEKempty), with PrPC (HEKPrP), or with
full-length PrPC versions in which the QGSPG (HEKAla-QGSPG) or KKRPK
(HEKAla-KKRPK) motifs were substituted (Ala) with alanines. The charge
neutralization within the KKRPK motif abrogated the cAMP induction. e, SW10ΔPrP
cells were treated with FT23–50 (2 µM) or a Col4-derived 21-meric synthetic
peptide containing either the GPRGKPG domain or its alanine-substituted variant
(AAAGAAG; both 811M). Both FT23–50 and the native Col4 peptide, but not the
alanine-substituted peptide (Ala-Col4), induced cAMP. Data are representative of
three biologically independent experiments; statistical significance was
evaluated by unpaired Student's t-test. Error bars show s.e.m.

Extended Data Figure 7 | PrnpZH3/ZH3 and Gpr126ΔSchwann mice display comparable
demyelination phenotypes. a, Transmission electron micrographs of sciatic nerves
from 14-month-old PrnpZH1/ZH1 mice (ZH1). Black arrowhead, thinly myelinated
axons; white arrowhead, abnormal cytoplasmic Schwann cell protrusions; boxes,
loss of axon-Schwann cell interactions; asterisk, initial onion bulb formation.
Scale bar, 2 µm in upper left panel; 500 nm in all other panels. b, c,
Quantification of unmyelinated axons in Remak bundles was performed manually by
counting the number of axons in the bundles from electron microscopy images
(1,500× magnification; 10 images per mouse were analysed and three mice per
genotype were used in total). The bundles were further sorted into three
categories: <10 axons, 10-20 axons and >20 axons per bundle. Comparisons were
performed between either BL6 and PrnpZH3/ZH3 (b) or Gpr126fl/fl (WT) and
DhhCre::Gpr126fl/fl (Gpr126ΔSchwann) mice (c; all mice were 13 months old). Both
PrnpZH3/ZH3 and Gpr126ΔSchwann mice showed a similar inclination towards a
decrease in the number of axons per bundle. Statistical significance was
established by performing a two-way ANOVA with Bonferroni correction. d, Onion
bulb-like structures were quantified from electron microscopy images (1,500×
magnification; 10 images per mouse were analysed and three mice per genotype
were used in total) of either BL6 and PrnpZH3/ZH3 or Gpr126fl/fl (WT) and
DhhCre::Gpr126fl/fl (Gpr126ΔSchwann) mice. These onion bulb-like structures were
prevalent only in PrnpZH3/ZH3 and Gpr126ΔSchwann mice, with Gpr126ΔSchwann
exhibiting more. Error bars show s.e.m.

Extended Data Figure 8 | FT is myelinotrophic in both zebrafish and
mice. a, Immunofluorescence for Mbp (green) in the posterior lateral line
nerve of wild-type zebrafish larvae. AcTub: acetylated tubulin (red)
labelling axons. Scale bar, 20 µm. b, gpr126st49 hypomorphic mutant larvae
were treated with vehicle (DMSO) or FT23–50 (20 µM) at 50–55 hpf and
immunostained at 5 dpf for Mbp (green). AcTub: acetylated tubulin
(red) labelling axons. Scale bar, 20 µm. FT23–50 did not alter Mbp
immunofluorescence. c, FT23–50 or FI'SS”*, was intravenously administered
to PrnpZH1/ZH1 and BL6 mice (600 µg per mouse, 20 min). After FT23–50
injection, cAMP levels in Prnp mice increased to levels approaching
those of BL6 mice. Each dot represents one mouse. d–f, cAMP also spiked
in hearts of mice injected with FT23–50 but not FT) (d). *P < 0.05;
**P < 0.01. FT23–50 or FIS5 4) was injected intravenously into
10–16-week-old PrnpZH1/ZH1 or BL6 mice (600 µg per animal, 20 min).
cAMP levels in kidneys (e) and brain (f) showed no significant changes.
Each dot represents an individual mouse; statistical significance was
evaluated by unpaired Student’s t-test. Error bars show s.e.m.
Extended Data Table 1 | Sequences of synthetic peptides used in the present study. The collagen-4 homology domain necessary
for cAMP induction is highlighted in yellow

Name            Sequence
FT23–50         KKRPKPGGWNTGGSRYPGQGSPGGNRYP
FT23–50-HA      KKRPKPGGWNTGGSRYPGQGSPGGNRYPYPYDVPDYA
FT39–66-HA      PGQGSPGGNRYPPQGGTWGQPHGGGWGQYPYDVPDYA
FT51–94-HA      PQGGTWGQPHGGGWGQPHGGSWGQPHGGSWGQPHGGGW
FT83–109-HA     PHGGGWGQGGGTHNQWNKPSKPKTNLKYPYDVPDYA
FT91–110-HA     GGGTHNQWNKPSKPKTNLKHYPYDVPDYA
FT91–110        GGGTHNQWNKPSKPKTNLKH
AlaQGSPG        KKRPKPGGWNTGGSRYPGAAAAAGNRYP
AlaKKRPK        AAAPAPGGWNTGGSRYPGQGSPGGNRYP
ColIV           GPRGKPGVDGYNGSRGDPGYP
ColIV-Mut       AAAGAAGVDGYNGSRGDPGYP
FT23–34         KKRPKPGGWNTG
SFT23–34        AAAGAAGGWNTG
DWARF14 is a non-canonical hormone receptor for strigolactone


Ruifeng Yao, Zhenhua Ming, Liming Yan, Suhua Li, Fei Wang, Sui Ma, Caiting Yu, Mai Yang, Li Chen, Linhai Chen,
Yuwen Li, Chun Yan, Di Miao, Zhongyuan Sun, Jianbin Yan, Yuna Sun, Lei Wang, Jinfang Chu, Shilong Fan, Wei He,
Haiteng Deng, Fajun Nan, Jiayang Li, Zihe Rao, Zhiyong Lou & Daoxin Xie


Classical hormone receptors reversibly and non-covalently bind 
active hormone molecules, which are generated by biosynthetic 
enzymes, to trigger signal transduction. The α/β hydrolase
DWARF14 (D14), which hydrolyses the plant branching hormone
strigolactone and interacts with the F-box protein D3/MAX2,
is probably involved in strigolactone detection1–3. However, the
active form of strigolactone has yet to be identified and it is unclear 
which protein directly binds the active form of strigolactone, and 
in which manner, to act as the genuine strigolactone receptor. 
Here we report the crystal structure of the strigolactone-induced 
AtD14-D3-ASK1 complex, reveal that Arabidopsis thaliana 
(At)D14 undergoes an open-to-closed state transition to trigger 
strigolactone signalling, and demonstrate that strigolactone is 
hydrolysed into a covalently linked intermediate molecule (CLIM) 
to initiate a conformational change of AtD14 to facilitate interaction 
with D3. Notably, analyses of a highly branched Arabidopsis 
mutant d14-5 show that the AtD14(G158E) mutant maintains 
enzyme activity to hydrolyse strigolactone, but fails to efficiently 
interact with D3/MAX2 and loses the ability to act as a receptor 
that triggers strigolactone signalling in planta. These findings 
uncover a mechanism underlying the allosteric activation of AtD14 
by strigolactone hydrolysis into CLIM, and define AtD14 as a non- 
canonical hormone receptor with dual functions to generate and 
sense the active form of strigolactone. 

The classical hormone receptors that have been reported so far 
reversibly bind their active hormone molecules, which are generated 
through sequential actions of biosynthetic enzymes4. In the plant
kingdom, all known hormones non-covalently bind their receptors
to initiate signalling, and dissociate from the receptors without
alteration5–10 (for further information, see Supplementary Information
refs 43–61). Strigolactones (SL) are plant hormones that play a vital role
in the control of plant branching11–13, and also act as rhizospheric signals
for communication with symbiotic fungi14 and parasitic plants15–17.
MORE AXILLARY GROWTH2 (MAX2) and DWARF14 (AtD14) in
Arabidopsis18–20, their orthologues in rice (D3 and OsD14)21,22 and
other species’!>-!”3, are essential signalling components to regulate
the SL-repressed plant branching. D14 encodes an α/β hydrolase
that hydrolyses GR24 (a synthetic SL analogue) into an intermediate
2,4,4,-trihydroxy-3-methyl-3-butenal (TMB)24, the final products
hydroxymethyl butenolide (D-OH)24 and tricyclic lactone
(ABC-OH)1,24, all of which lose their SL biological activity1.
MAX2/D3 encodes an F-box protein, which interacts with D14 in an
SL-dependent manner1,25–27 and recruits various repressors (such as
rice D53 and Arabidopsis SMXL6/7/8) for 26S proteasome-mediated
degradation25–27. The D14–D3/MAX2 complex28, D14–D53/SMXLs
complex29, D3/MAX2 protein30 and, particularly, D14 proteinh”!5-179


were previously proposed to be involved in SL perception. However, 
the active form of SL has not been determined, and it remains unknown 
which protein directly binds the active form of SL, and in which 
manner, to act as the SL receptor that triggers SL signalling. 

To investigate the SL perception mechanism, we purified AtD14 and 
D3, which share high homology with MAX2 (Extended Data Fig. 1a) 
and rescue the max2 mutant (Extended Data Fig. 1b, c), assembled the 
GR24-induced complex of AtD14 and D3 (co-expressed with ASK1 
protein? 8) and solved the crystal structure of this complex (Fig. 1a, b 
and Extended Data Table 1). This structure presents a seahorse-shaped 
organization, in which AtD14 acts as the head part, the leucine-rich 
repeat domain of D3 (D3 LRR) forms the body, and the F-box motif of 
D3 (D3 F-box) together with ASK1 serve as the tail. AtD14 is attached 
to the C terminus of D3 LRR with its catalytic cavity facing to D3 LRR. 
Electron density maps (Extended Data Fig. 2a) suggest that a small 
molecule is sealed inside the closed catalytic cavity of AtD14 without 
contacting D3 (Fig. 1a): this small molecule fixes its two arms through 
stable contacts with the Nε2 atom of H247 and the hydroxyl oxygen
of S97 of AtD14. Consistent with these structural observations, GR24 
hydrolysis assays showed that the presence of D3 inhibited the AtD14- 
mediated GR24 hydrolysis to attenuate the release of D-OH (Extended 
Data Fig. 2b), suggesting that D3 binds AtD14 to retain a hydrolytic
D-ring-derived intermediate inside the closed catalytic cavity of AtD14.

To further identify the small molecule sealed inside AtD14, we 
collected the GR24-induced AtD14—D3-ASK1 complex using size- 
exclusion chromatography (Fig. 1b) to separate AtD14 for tandem
mass spectrometry (MS/MS) analysis. Peptide matching from MS/ 
MS spectra identified a modified peptide of AtD14 with a molecular 
weight corresponding to the C5H5O2 modification covalently linked to
H247 (Extended Data Fig. 3b), whereas such a modification was not 
detected on the hydrolytic mutants AtD14(S97A) or AtD14(H247A) 
(Extended Data Fig. 4c-e). All of the reported natural SLs contain the 
same enol ether bridge-linked D-ring as GR24 but have variable groups 
to replace the ABC ring*!*”°. The biologically active compounds, 
5-deoxystrigol (5DS, a native SL) and 4-Br debranone (4BD, an SL ana- 
logue) (Extended Data Fig. 3a), both effectively induced AtD14-D3 
complex formation (Extended Data Fig. 3c) and generated the same 
C5H5O2 modification as GR24 on AtD14 (Extended Data Fig. 3d, e).
In contrast, the biologically inactive compounds'””° carba-GR24 and 
D-OH, without the ability to induce the AtD14-D3/MAX2 interaction, 
failed to generate the C5H5O2 modification on AtD14 (Extended Data
Fig. 5). These results suggest that various biologically active SL mole- 
cules are hydrolysed into the same D-ring-derived intermediate sealed 
inside the D3-bound AtD14. In support of this, further MS/MS analysis 
using 4BD with the 2H3-labelled D-ring (2H3-4BD) identified a 2H3-
labelled C5H5O2 modification on AtD14 (Fig. 1c, d and Extended Data
Figure 1 | Overall structure of AtD14-D3-ASK1 in complex with an active
SL intermediate. a, Overall structure of AtD14-D3-ASK1 in complex with 
a GR24 hydrolysis intermediate CLIM. AtD14, ASK1 and D3 are coloured as 
light blue, pale green and rainbow scheme (N to C terminus from blue to red), 
respectively. CLIM is shown as space-fill representation. b, Size-exclusion 
chromatography analysis (left panel) of the interaction between AtD14 and 
D3-ASK1; the elution volumes of the molecular weight markers are indicated 
on the top. SDS-PAGE analysis (right panel) of peak fractions from the left 


Fig. 6). Moreover, the same C5H5O2 modification was also detected
in planta when plants were treated with GR24 (Extended Data Fig. 4a) 
or 5DS (Extended Data Fig. 4b). 
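The mass shifts behind these MS/MS assignments can be checked with a short calculation. The sketch below uses standard monoisotopic masses and makes one labelled assumption: that the C5H5O2 modification replaces a single hydrogen on H247, giving a net addition of C5H4O2 (about 96.02 Da). The function and constant names are illustrative only; the reference m/z values are those quoted in the figure legends for the peptide 244-TEGHLPQLSAPAQLAQFLR-262.

```python
# Monoisotopic residue masses (Da) for the amino acids present in the peptide.
RESIDUE_MASS = {
    "A": 71.03711, "E": 129.04259, "F": 147.06841, "G": 57.02146,
    "H": 137.05891, "L": 113.08406, "P": 97.05276, "Q": 128.05858,
    "R": 156.10111, "S": 87.03203, "T": 101.04768,
}
WATER = 18.01056      # H2O added for the free N and C termini
PROTON = 1.007276     # mass of one charge-carrying proton
C, H, D, O = 12.0, 1.007825, 2.014102, 15.994915  # D is deuterium (2H)

def peptide_mz(sequence: str, charge: int, modification: float = 0.0) -> float:
    """Return the m/z of a protonated peptide carrying an optional mass shift."""
    neutral = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER + modification
    return (neutral + charge * PROTON) / charge

PEPTIDE = "TEGHLPQLSAPAQLAQFLR"

# Assumption: the C5H5O2 group replaces one H on H247, so the net gain is C5H4O2.
CLIM_SHIFT = 5 * C + 4 * H + 2 * O
# Assumption: the 2H3-labelled D-ring swaps three 1H atoms for three 2H atoms.
LABELLED_SHIFT = CLIM_SHIFT + 3 * (D - H)

unmodified_3plus = peptide_mz(PEPTIDE, 3)                 # legend value: 693.0458
modified_3plus = peptide_mz(PEPTIDE, 3, CLIM_SHIFT)       # legend value: 725.0528
modified_2plus = peptide_mz(PEPTIDE, 2, CLIM_SHIFT)       # legend value: 1087.0736
labelled_2plus = peptide_mz(PEPTIDE, 2, LABELLED_SHIFT)   # legend value: 1088.5828
```

Under these assumptions the computed values fall within about 0.002 m/z units of the reported ones, which is consistent with a single C5 unit attached to the histidine.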

Together with the previously determined SL hydrolysis compounds1,24
and the general principle of the Ser–His–Asp catalytic triad,
the above-mentioned findings enable us to rationally
deduce the chemical structure of this D-ring-derived small molecule 
CLIM in our AtD14—-D3-ASK1 crystal (Extended Data Fig. 2c, d and 
Supplementary Notes), suggesting that CLIM is generated by AtD14 
during SL hydrolysis and covalently sealed inside AtD14 to act as the
active SL molecule that initiates SL signalling. 

Notably, the architecture of the binding pocket for CLIM in our 
crystal markedly differs from that of GR24, TMB and D-OH in the 
previously reported D3-unbound OsD14 crystals, which possess a
similar overall structure to the apo AtD14/OsD14 protein1,24. Compared
panel; M, molecular weight ruler (kDa). c, d, MS/MS spectra of a bivalently
charged peptide (244-TEGhLPQLSAPAQLAQFLR-262) of AtD14 at mass-
to-charge ratio (m/z) 1087.0736 or 1088.5828 corresponding to the mass of
C5H5O2 or C5H2(2H3)O2 modification of the peptide, which was isolated from
AtD14 in the 4BD- (c) or 2H3-labelled 4BD-induced (d) AtD14–D3 complex
with in-solution digestion method, respectively. Labelled peaks correspond
to masses of y and b ions of the modified peptide. Lowercase ‘h’ indicates the
modified H247. 


to the apo AtD14 (open state), AtD14 in the GR24-induced AtD14–D3
complex (closed state) has experienced significant conformational
changes, referred to as the open-to-closed transition (Fig. 2). The open
state contains an unclosed lid with four top helices (αT1–αT4), and
a large open pocket (420 Å³) (Fig. 2a–d and Extended Data Fig. 7a–f)
compatible with binding large molecules (such as GR24). However,
the closed state contains a collapsed lid with only three helices (αT1,
αT3 and αT4), and a small closed pocket (80 Å³) (Fig. 2e–h) suitable
for embedding small molecules (such as CLIM). During the
open-to-closed state transition, αT2 rotates towards αT1 while αT3
rotates towards αT4, coupled by the helix-to-coil transition of αT2
and the coil-to-helix transition of the loop between αT1 and αT2,
making the pocket smaller and closed to generate the flat tri-helix lid
compatible for binding D3 (Fig. 2e–g). Notably, S97 and H247 stay
almost in the original positions of the catalytic triad (S97–H247–D218),
Figure 2 | Allosteric activation of AtD14 for the open-to-closed
conformational transition. a–h, Motif labelling and structural
comparison between apo AtD14 (a, PDB code: 4IH4) and AtD14–CLIM in
D3-bound form (e), focusing on the shape (b and f), volume (c and g) and
details (d and h) of the catalytic cavity. a–d, Apo AtD14 in the open state;
f–h, AtD14–CLIM in D3-bound form in the closed state.


and constitute a hydrophobic cavity, together with a subset of 
residues (including F28, F126, F175, L179 and V194) (Fig. 2h), to 
seal CLIM inside AtD14. However, the Lβ7-αE loop (the loop region
between β7 and αE) containing D218 of the catalytic triad (Fig. 2a)
moves away and becomes disordered in the D3-bound AtD14
(Fig. 2e), which would be expected to disrupt the S97–H247–D218
catalytic triad of AtD14 to prevent CLIM from further hydrolysis into
final product D-OH in the D3-bound AtD14. It is possible that this
destabilized Lβ7-αE in the D3-bound AtD14 is involved in interaction
with other proteins such as the repressors D53/SMXLs that interact
with AtD14/OsD14 in an SL-dependent manner!*”>-?”.

We further solved the crystal structure of AtD14-unbound D3-ASK1 
to make a structural comparison with the AtD14-bound D3-ASK1. 
Both structures contain the N-terminal tri-helical F-box motif bound 
to ASK1, and the C-terminal LRR domain (D3 LRR) with 19 LRRs 
packing in tandem and assembling into a solenoid with an overall 
horseshoe-shaped structure (Extended Data Fig. 8a–d), which is similar
to TIR1–ASK1 and COI1–ASK1 complexes**. As a notable structural
feature of AtD14-bound D3, LRR15, LRR16, LRR17 and LRR19
each have an unusually long loop between their α-helix and β-strand
(Extended Data Fig. 8a, b). Loop16, loop17 and loop19 are involved in
the interaction with AtD14 (Fig. 3a) and in turn stabilized by AtD14,
whereas loop15 is not directly involved in contacting AtD14 although
stabilized by AtD14 (Extended Data Fig. 8c, d). In contrast, most of
LRR13 is disordered in both structures, implying that LRR13 (and 
loop15) are possibly involved in targeting another protein. 

Further structural analyses reveal a well-defined interface to mediate 
the AtD14-D3 interaction. The C-terminal LRRs of D3 extensively 
bind AtD14 in the closed state (on the lid domain as well as some 
Figure 3 | Structural analyses of the AtD14–D3 interaction.
a, An overview of the AtD14–D3 interface. Loop16, loop17 and loop19,
which are involved in the AtD14–D3 interaction, are labelled. The minor
and major parts of the interface in AtD14 are highlighted by sand and
green, respectively. b–e, Structural interactions between AtD14 and D3.
The structural elements of AtD14 and D3 are coloured light blue and
salmon, respectively. Residues are labelled with a superscript AtD14 or
D3 to indicate which protein they originate from. Dashed lines indicate
hydrogen-bonding interactions, except for van der Waals forces between
S608D3 and V164AtD14.


short helices and loops near the lid) to form an AtD14–D3 interface
with an area of 3,013 Å² (Fig. 3a). This interface has two separate
parts (termed the minor interface and major interface, on the basis of
their sizes). The minor interface contains an extensive intermolecular
hydrogen bond network, involving four critical residues (T599D3 and
D606D3 of loop16, E174AtD14 and R177AtD14 of αT3) (Fig. 3b). This
network is further stabilized by T602D3–E245AtD14 hydrogen bonding
and S608D3–V164AtD14 van der Waals contact. The major interface is
further divided into two distinct areas: (1) a hydrophobic area (Fig. 3c),
with predominant van der Waals interactions, consists of residues from
D3 LRR (L609D3, R612D3, L644D3, P645D3, L649D3 and F669D3) and
the AtD14 lid domain (A160AtD14 to V164AtD14 and F180AtD14); (2) a
hydrophilic area (Fig. 3d, e), with many hydrogen-bond interactions,
consists of residues from two helices of D3 LRR (LRR18 and LRR19)
and an AtD14 surface region that is covered by αT3 in the AtD14
open state but exposed in the closed state. D3 LRR18 contributes to
AtD14–D3 interaction through a tandem hydrogen bonding zipper
(N1814*4_1766623-G574P4_£667>3-H668P3-C55AP 4 /Vy5gAtD 14) 

(Fig. 3d), whereas D3 LRR19 contributes to AtD14—D3 interaction 
through a cascade of intermolecular hydrogen bonds (E704°°— 
N11“? 4_R70223_-D524P4_R702P3_D3 1A 4_§3341D4_T¢699>3_R70203) 
MAX2 for triggering SL signalling. a, b, The d14-5 mutant exhibits
less sensitivity to GR24 inhibitory hypocotyl elongation (a) and displays
significantly more primary branches (b). a, Representative hypocotyl 
phenotype (left panel) of Col-0, d14-5 and two independent transgenic 
lines d14-5::AtD14 line 1 (L1) and d14-5::AtD14 L2; and relative hypocotyl
lengths (right panel) of the indicated seedlings. Scale bars, 5mm. The 
images are representative of 20 plants for each genotype, each treatment 
and each biological replicate. b, Representative branching phenotype (left 
panel) and quantitative analysis of primary rosette branches (right panel) 
of the indicated plants. Scale bars, 1 cm. The images are representative
of 20 plants for each genotype and each biological replicate. Data in
both a and b are means ± s.d. (n = 3 biological replicates of 20 plants
per genotype); **P < 0.01 (ANOVA). c, The branch elongation of the
bottom bud of the excised stem segments is inhibited by GR24 in Col-0
and max3-11, but not in max2-3 and d14-5. Data are means ± s.d. (n = 3
biological replicates of 12 segments per genotype); NS, not significant; 
*P <0.05 (ANOVA). d, AtD14(G158E) fails to efficiently bind MAX2. 
Pull-down assays using recombinant His6–MAX2 and GST–AtD14 or
GST-AtD14(G158E) in the absence or presence of GR24. Full blots are 
shown in Supplementary Fig. 1. e, AtD14(G158E) exhibits higher activity 
to hydrolyse (±)-GR24 into D-OH. Released D-OH in each indicated
reaction mixture was quantitated by LC-MS/MS. Data are means ± s.d.
(n = 6). f, A simplified model of SL perception. AtD14 in its open state
docks SL in the catalytic cavity, hydrolyses SL into a D-ring-derived 
molecule (CLIM), which is covalently sealed inside the catalytic centre of 
AtD14. AtD14 undergoes conformational change suitable for interaction 
with D3/MAX2-based SCF complex, which may in turn strengthen this 
conformational change, to trigger SL signal transduction. 
(Fig. 3e). Key residues in both minor and major interfaces are
largely conserved (Extended Data Figs 1a and 7g), and the amino
acid substitution of the critical residues (such as F180 (ref. 3),
P161, E174 and R177 of AtD14; and D606, L644 and R702 of D3)
severely attenuates GR24-induced AtD14–D3 interaction (Extended
Data Fig. 8e, f). Taken together, our structural analyses reveal a well-
defined interface that mediates the AtD14–D3 interaction.

Genetic analysis of an Arabidopsis mutant allele d14-5, with the 
Gly158Glu substitution in AtD14 (AtD14(G158E)), showed that 
d14-5 exhibited reduced sensitivity in GR24-inhibitory hypocotyl 
elongation (Fig. 4a) and axillary bud growth (Fig. 4c), and displayed 
increased primary branches (Fig. 4b), suggesting that SL signalling is 
severely impaired in d14-5. Structural analyses suggest that the G158 
residue participates in the construction of a π-turn structure (A154–
W155–V156–H157–G158–F159) (Extended Data Fig. 9d, e). This
π-turn structure, with an intrinsic capacity to stop further extension
of the aT1 C terminus during the open-to-closed transition of AtD14 
(Fig. 2a, e), contributes to stabilizing proper conformation of the AtD14 
lid that is essential for binding D3/MAX2 (Fig. 4f and Extended Data 
Fig. 9e). G158E substitution would be expected to impair the closed 
state of the AtD14 lid to attenuate interaction between AtD14(G158E) 
and D3/MAX2. Indeed, AtD14(G158E) failed to efficiently interact
with MAX2/D3 in the presence of GR24 (Fig. 4d and Extended 
Data Fig. 9b), suggesting that G158E substitution attenuates the 
GR24-induced interaction between AtD14 and D3/MAX2. Notably, 
AtD14(G158E) efficiently interacted with the signalling repressor 
SMXL6 in the presence of GR24 (Extended Data Fig. 9c), suggesting 
that AtD14(G158E) is able to undergo some conformational changes 
for binding SMXL6 in the presence of GR24. Notably, the SL hydrolysis 
assays showed that AtD14(G158E) maintains intact activity to mediate 
GR24 hydrolysis. In addition, AtD14(G158E) hydrolysed GR24 
faster than AtD14 (Fig. 4e), which is consistent with our structural 
analyses that G158E substitution would impair the closed state 
of the AtD14(G158E) lid to enable more efficient GR24 uptake 
and D-OH release. These results demonstrate that the G158E
substitution in AtD14(G158E) attenuates perception function to
impair SL signalling, but maintains the ability to mediate SL hydrolysis,
providing an independent line of evidence that AtD14 has dual 
functions as an enzyme that hydrolyses SL and a receptor that initiates 
SL signalling. 

Previous crystallographic studies found that, similar to apo D14, 
all of the D3-unbound D14 in complex with GR24, TMB or D-OH 
are in the open state without overall conformational shift!>7+°°, Our 
work has uncovered a closed state of AtD14 in the SL-induced AtD14-
D3 complex. This closed state is probably attributable to D3 binding: 
AtD14 hydrolyses SLs into the D-ring-derived intermediate CLIM that 
covalently binds inside the catalytic pocket of AtD14, and undergoes 
conformational transition for interaction with D3, which may in turn 
strengthen this conformational change, to trigger SL signalling (Fig. 4f). 
Distinct from all known active phytohormone molecules that are 
first generated by biosynthesis enzymes and then non-covalently and 
reversibly bound by their receptors*!°, CLIM represents a novel active 
hormone molecule that is generated by AtD14-mediated SL hydrolysis 
and sealed inside AtD14 through covalent links with H247 and probably
S97, although direct evidence is needed to further validate the
S97 linkage. Previous studies25–27 have not yet elucidated the sequence
of interactions among D14, D3/MAX2 and D53/SMXLs. Further 
structural and functional investigations on the D14-D3/MAX2-D53/ 
SMXLs complex assembly will provide deeper mechanistic insights into 
the SL signalling pathway. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
Extended Data Figure 1 | D3 orthologues among different plant species
and complementation of Arabidopsis max2 by rice D3. a, Secondary 
structure elements of D3 in the crystal structure of D3-AtD14 complex 
are shown on top of Oryza sativa D3 sequence. Identical and conserved 
residues are highlighted by red and white grounds, respectively. The 19 
LRRs are indicated by solid magenta rectangles. The GenBank accession 
numbers from top to bottom: Oryza sativa D3 (BAD69288), Arabidopsis 
thaliana MAX2 (NP565979) and Petunia hybrida MAX2 A (AEB97384). 
b, Branching phenotypes of seven-week-old Col-0, max2-3 and two
independent lines (L1 and L2) of max2-3 plants expressing Flag-tagged 
rice D3 (max2::D3). Scale bar, 1 cm. The images are representative of 20 
plants for each genotype. c, The Flag—D3 protein in corresponding plants 
(b) was detected by anti-Flag antibody (upper panel). Staining of Rubisco 
large subunit (LSU) served as a loading control (lower panel). Full blots are 
shown in Supplementary Fig. 1. 
Extended Data Figure 2 | A proposed mechanism of AtD14-mediated
SL hydrolysis. a, CLIM forms stable contacts with S97 and H247 in the
catalytic centre of AtD14. CLIM, the nearby contacting residues and two
catalytic residues (S97 and H247) are represented as sticks. The 2Fo − Fc
omit map (black mesh) for CLIM is contoured at 1σ and the 2Fo − Fc
electron density maps (red mesh) for CLIM, S97 and H247 are contoured
at 1σ (left panel) and 1.5σ (right panel), respectively. b, D3 inhibits the
release of D-OH during the AtD14-mediated GR24 hydrolysis. Released
D-OH in each indicated reaction mixture after 60 min incubation at
37 °C was quantitated by LC-MS/MS. 50 µM GR24, 8 µM AtD14
and/or 15 µM D3 were used. Data are means ± s.d. (n = 6). c, Proposed
mechanism of AtD14-mediated GR24 (compound 1) hydrolysis. The
reaction begins with the well-accepted nucleophilic attack by S97 in the
catalytic triad'***°. Decomposition of the tetrahedral intermediate I
resulted in the enolate form of ABC-OH (2) (which upon protonation
gives compound 2) and the S97-bound moiety 3. The alternative direct
hydrolysis of GR24 (1) is not supported by the fact that AtD14 S97 mutant
is catalytically inactive’. Likewise, direct hydrolysis of the ester bond in
S97-bound moiety 3 is an unlikely event. As the electron density shape
(Extended Data Fig. 2a) and the C5H5O2 modification (Fig. 1c, d and
Extended Data Figs 3 and 4) support a covalent bond between H247 and
the C5 unit moiety, the transformation from S97-bound moiety 3 to
D-OH (6) proceeds via the following cascade: (i) The Nε2 atom of H247
attacks the aldehyde carbonyl of the S97-bound moiety 3 to form the
hemiaminal 4, referred to as the covalently linked intermediate molecule
(CLIM). The structure of the H247- and S97-linked 4 is consistent with
the electron density shape (Extended Data Fig. 2a). (ii) Lactonization of
the hemiaminal 4, via a second tetrahedral intermediate (not shown),
gives free S97 and H247-bound lactone 5. The structure of 5 is consistent
with the C5H5O2 modification of H247 detected by MS/MS (Fig. 1c, d and
Extended Data Figs 3 and 4). (iii) Finally, the hydrolysis of H247-linked C5
unit will produce D-OH (6). A more detailed chemistry analysis is shown
in the Supplementary Notes. d, Proposed mechanism of AtD14-mediated
4BD hydrolysis, which is similar to that of GR24.
Extended Data Figure 3 | Various SL molecules generated the same
C5H5O2 modification of AtD14. a, Chemical structures of rac-GR24
(racemic GR24), 4BD and 5DS with the conserved D-ring. b, GR24-
induced C5H5O2 modification of H247 of AtD14 was identified. MS/MS
spectra of a triply charged peptide (244-TEGhLPQLSAPAQLAQFLR-262)
of AtD14 at mass-to-charge ratio (m/z) 725.0528 corresponding to the
mass of the C5H5O2 modification of H247. The AtD14 gel bands of the
SEC-separated GR24-induced AtD14–D3 complex were excised for
trypsin digestion and MS/MS analysis. Labelled peaks correspond to
masses of y and b ions of the modified peptide. Lowercase ‘h’ indicates
the modification on H247 by a GR24 hydrolysis intermediate with the
molecular formula C5H5O2. c, Size-exclusion chromatography analysis
of the 5DS- or 4BD-induced AtD14–D3 interaction using the same
method as described in Fig. 1b. d, e, 4BD (d) and 5DS (e) induced the
same C5H5O2 modification of H247, which was detected using the in-gel
digestion method as described in b.
Extended Data Figure 4 | SLs generated the C5H5O2 modification of
AtD14 in planta. a, b, The existence of CLIM in vivo was suggested by
successful identification of the same C5H5O2 modification on the
AtD14–Flag purified from plants pre-treated with GR24 (a) or 5DS (b).
The MS/MS analysis was performed with the same method as mentioned
in Extended Data Fig. 3b. c, No modification was observed on the 247th
amino acid of AtD14(S97A). The MS/MS spectra of a triply charged
peptide 244-TEGHLPQLSAPAQLAQFLR-262 of AtD14(S97A) at
mass-to-charge ratio (m/z) 693.0458 corresponding to the mass of
unmodified peptide, which was digested from the AtD14(S97A) band
separated by SDS-PAGE of the mixture containing AtD14(S97A), GR24
and D3–ASK1. d, No modification was observed on the 247th amino
acid of AtD14(H247A). The MS/MS spectra of a triply charged peptide
244-TEGALPQLSAPAQLAQFLR-262 of AtD14(H247A) at m/z 671.0434
corresponding to the mass of unmodified peptide, which was digested
from the AtD14(H247A) band separated by SDS-PAGE of the mixture
containing AtD14(H247A), GR24 and D3–ASK1. e, Size-exclusion
chromatography analyses showed that the alanine substitution on S97
or H247 of AtD14 disrupted the GR24-induced interaction between
AtD14 and D3–ASK1. The size-exclusion chromatography and SDS-PAGE
were performed as described in Fig. 1b.
Extended Data Figure 5 | Carba-GR24 and D-OH could neither
induce AtD14–D3 interaction nor generate the C5H5O2 modification
on AtD14. a, Pull-down assays using recombinant His6–MAX2 and
GST–AtD14 with indicated concentration of carba-GR24 and GR24.
Full blots are shown in Supplementary Fig. 1. b, The MS/MS spectra of a
triply charged peptide 244-TEGHLPQLSAPAQLAQFLR-262 of AtD14
at m/z of 693.0450 corresponding to the mass of an unmodified peptide,
which was digested from the AtD14 bands separated by SDS-PAGE of
the mixture containing AtD14, carba-GR24 and D3–ASK1. c, Chemically
synthesized D-OH was unable to induce AtD14–D3 interaction. AtD14
was incubated with D3–ASK1 in the presence of 8 mM D-OH at 25 °C
for one hour, followed by size-exclusion chromatography analysis as
described in Fig. 1b. d, The MS/MS spectra of a triply charged peptide
244-TEGHLPQLSAPAQLAQFLR-262 of AtD14 at m/z of 693.0454
corresponding to the mass of an unmodified peptide, which was digested
from the AtD14 bands separated by SDS-PAGE of the mixture containing
AtD14, D-OH and D3–ASK1.
Extended Data Figure 6 | The MS spectra and chromatograms of the
C5H5O2-modified peptide of AtD14. a, The MS spectra of a bivalently
charged peptide (244-TEGhLPQLSAPAQLAQFLR-262) of AtD14 at
m/z of 1087.0736 or 1088.5828 corresponding to the mass of C5H5O2 or
C5H2(2H3)O2 modification of the peptide, which was isolated from AtD14
in the 4BD or 2H3-labelled 4BD-induced AtD14–D3 complex with
in-solution digestion method, respectively. b, The chromatograms of the
unmodified (top panel), C5H5O2- (middle panel) or C5H2(2H3)O2-modified
(bottom panel) peptide of AtD14 (244-TEGHLPQLSAPAQLAQFLR-262)
in the LC-MS/MS analyses. The mass spectra of the C5H5O2- or
C5H2(2H3)O2-modified peptide are shown in a.
Extended Data Figure 7 | Comparison of different D14 structures.
a–f, Related to Fig. 2, the characteristic features of two previously reported
D3-unbound OsD14 structures, focusing on the shape (a, d), volume
(b, e) and details (c, f) of the catalytic centre (OsD14–TMB, PDB code:
4IHA; OsD14–D-OH, PDB code: 3WIO). g, Sequence alignment and
structural annotation of D14 orthologues among different species.
Secondary structure elements of AtD14 in the crystal structure of the
D3–AtD14 complex are displayed on top of the Arabidopsis thaliana D14
sequence. Identical and conserved residues are highlighted by red and
white grounds, respectively. The three catalytic residues, S97, D218 and
H247, are indicated by magenta dots. The GenBank accession numbers for
sequences from top to bottom: Arabidopsis thaliana D14 (Q9SQR3), Oryza
sativa D14 (Q10QA5) and Petunia hybrida DAD2 (AFR68698).
Extended Data Figure 8 | Structures of D3–ASK1 and mutagenesis
analyses of the AtD14–D3 interaction. a, b, Overall structure of
AtD14-bound D3–ASK1 in top view (a) and side view (b). ASK1 and
D3 are coloured as pale green and rainbow scheme, respectively. LRRs
are labelled. The unusually long loops in LRR16, LRR17 and LRR19
(denoted as loop16, loop17 and loop19) are indicated. The missing part
of LRR13 is denoted as a dashed line. c, d, Structural comparison between
AtD14-bound D3–ASK1 (magenta) and the apo D3–ASK1 (pale green),
with their orientations related to a and b. The indicated loop15, loop16,
loop17 and loop19 undergo significant conformational changes during
AtD14 binding. e, f, Amino acid substitution of the central residues
(P161AtD14, E174AtD14, R177AtD14, D606D3, L644D3 and R702D3) on the
interface attenuates interaction between AtD14 and D3–ASK1. GR24 was
supplemented in all the reactions. The GR24-induced interactions between
D3–ASK1 and wild-type AtD14 or unrelated mutant AtD14(S58A) were
used as the control. The size-exclusion chromatography and SDS-PAGE
were performed as described in Fig. 1b.
Extended Data Figure 9 | Gly158 plays an important role in the 
SL-induced open-to-closed transition of AtD14 lid. a, AtD14-Flag 
protein levels in two independent lines (L1 and L2) of the d14-5 plants 
with transgenic expression of Flag-tagged AtD14 (d14-5::AtD14, 
mentioned in Fig. 4b) (upper panel). Staining of Rubisco large subunit 
(LSU) served as a loading control (lower panel). Full blots are shown in 
Supplementary Fig. 1. b, AtD14(G158E) was incubated with D3-ASK1 in 
the presence of GR24, followed by size-exclusion chromatography analysis 
as described in Fig. 1b. c, AtD14(G158E) maintains the SL-induced 
interaction with SMXL6. Pull-down assays using recombinant SMXL6-Flag 
and GST-AtD14 or GST-AtD14(G158E) in the absence or presence 
of 20 μM GR24. Full blots are shown in Supplementary Fig. 1. 
d, G158 of AtD14 participates in the construction of a π-turn structure and 
the formation of a hydrogen bond with R652 of D3. The π-turn structure, and 
structural elements of AtD14 and D3 are coloured as blue, light blue and 
salmon, respectively. The π-turn residues include A154, W155, V156, 
H157, G158 and F159. e, During the open (apo AtD14, cyan) to closed 
(D3-bound AtD14, green) state transition of AtD14, the αT2 and αT3 
helices of the lid domain experience pronounced structural and positional 
alterations (upper panel). In the open state (lower-left panel), the N 
terminus of αT2 and portions from αT1 and its immediate preceding loop 
are more flexible than other parts of the AtD14 lid domain, as judged by 
the higher temperature factors (B-factors) of these regions. In the closed 
state (lower-right panel), the π-turn structure, with an intrinsic capacity to 
stop further extension of the αT1 C terminus during the open-to-closed 
transition of AtD14, contributes to stabilizing proper conformation of the 
AtD14 lid in the AtD14-D3 complex. Therefore, substitution of Gly158 
with a good helix-forming residue Glu would expectedly impair normal 
conformational changes of the lid which was observed in the AtD14-D3 
complex. 
Extended Data Table 1 | Data collection and refinement statistics 

                                     D14-D3-ASK1 complex                           D3-ASK1 complex
                                     Se-SAD peak            Native
PDB codes (*)                                               5HZG                   5HYW
Data collection statistics
Cell parameters
  a (Å)                              107.1                  106.4                  79.7
  b (Å)                              173.2                  173.0                  79.7
  c (Å)                              187.4                  186.8                  327.3
  α, β, γ (°)                        90, 90, 90             90, 90, 90             90, 90, 90
Space group                          P212121                P212121                P43
Wavelength used (Å)                  0.9792                 0.9792                 0.9792
Resolution (Å)                       50.0-3.6 (3.7-3.6)(c)  50.0-3.3 (3.4-3.3)     50.0-3.0 (3.05-3.0)
No. of all reflections               594,727 (41,118)       231,240 (50,269)       141,268 (7,085)
No. of unique reflections            58,980 (4,041)         11,666 (3,153)         38,298 (1,969)
Completeness (%)                     100.0 (100.0)          99.9 (99.9)            94.1 (95.5)
Average I/σ(I)                       31.6 (6.5)             12.6 (3.1)             12.0 (2.8)
Rmerge(a) (%)                        15.2 (64.6)            11.3 (50.3)            14.0 (54.6)
Refinement statistics
No. of reflections used (σ(F) > 0)                          49,674                 38,237
Rwork(b) (%)                                                24.6                   23.5
Rfree(b) (%)                                                31.6                   S12
r.m.s.d. bond distance (Å)                                  0.011                  0.011
r.m.s.d. bond angle (°)                                     1.781                  1.671
Average B-value (Å²)                                        89.0                   62.0
No. of atoms                                                15,676                 10,174
Ramachandran plot
  Res. in favored regions (%)                               100.0                  100.0
  Res. in outlier regions (%)                               0.0                    0.0

(a) $R_{\mathrm{merge}} = \sum_h \sum_i |I_{h,i} - I_h| / \sum_h \sum_i I_{h,i}$, where $I_h$ is the mean intensity of the i observations of symmetry-related reflections of h. 
(b) $R_{\mathrm{work}} = \sum ||F_p(\mathrm{obs})| - |F_p(\mathrm{calc})|| / \sum |F_p(\mathrm{obs})|$; $R_{\mathrm{free}}$ is an R factor for a pre-selected subset (5%) of reflections that was not included in refinement. $F_p$, structure factor of protein. 
(c) Numbers in parentheses are corresponding values for the highest resolution shell. 


© 2016 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 


LETTER 


doi:10.1038/nature18952 


Vaccine protection against Zika virus from Brazil 
Michael Boyd¹, David Ng’ang’a¹, Marinela Kirilova¹, Ramya Nityanandam¹, Noe B. Mercado¹, Zhenfeng Li¹, Edward T. Moseley¹, 
Lori F. Maxfield¹, Rafael A. De La Barrera³, Richard G. Jarman³, Kenneth H. Eckels³, Nelson L. Michael³, Stephen J. Thomas³ & 
Dan H. Barouch¹,⁴ 


Zika virus (ZIKV) is a flavivirus that is responsible for the current 
epidemic in Brazil and the Americas’. ZIKV has been causally 
associated with fetal microcephaly, intrauterine growth restriction, 
and other birth defects in both humans** and mice®""'. The rapid 
development of a safe and effective ZIKV vaccine is a global 
health priority’, but very little is currently known about ZIKV 
immunology and mechanisms of immune protection. Here we 
show that a single immunization with a plasmid DNA vaccine or a 
purified inactivated virus vaccine provides complete protection in 
susceptible mice against challenge with a strain of ZIKV involved 
in the outbreak in northeast Brazil. This ZIKV strain has recently 
been shown to cross the placenta and to induce fetal microcephaly 
and other congenital malformations in mice¹¹. We produced DNA 
vaccines expressing ZIKV pre-membrane and envelope (prM-Env), 
as well as a series of deletion mutants. The prM-Env DNA vaccine, 
but not the deletion mutants, afforded complete protection against 
ZIKV, as measured by absence of detectable viraemia following 
challenge, and protective efficacy correlated with Env-specific 
antibody titers. Adoptive transfer of purified IgG from vaccinated 
mice conferred passive protection, and depletion of CD4 and CD8 
T lymphocytes in vaccinated mice did not abrogate this protection. 
These data demonstrate that protection against ZIKV challenge 
can be achieved by single-shot subunit and inactivated virus 
vaccines in mice and that Env-specific antibody titers represent key 
immunologic correlates of protection. Our findings suggest that the 
development of a ZIKV vaccine for humans is likely to be achievable. 
The World Health Organization declared the clusters of microcephaly 
and neurological disorders and their association with ZIKV infection 
to be a global public health emergency on February 1, 2016. ZIKV is 
believed to cause neuropathology in developing fetuses by crossing 
the placenta and targeting cortical neural progenitor cells”"“, leading 
to impaired neurogenesis and resulting in microcephaly and other 
congenital malformations. ZIKV has also been associated with neuro- 
logic conditions in adults, such as Guillain-Barré syndrome’. 
Vaccines have been developed for other flaviviruses, including yellow 
fever virus, Japanese encephalitis virus, tick-borne encephalitis virus, 
and dengue viruses, but no vaccine currently exists for ZIKV. To 
develop preclinical challenge models for candidate ZIKV vaccines, 
we obtained low-passage ZIKV isolates from northeast Brazil (Brazil/ZKV2015; 
University of São Paulo)¹¹ and Puerto Rico (PRVABC59; US 
Centers for Disease Control and Prevention) (Extended Data Fig. 1). We 
expanded these viruses in Vero cells to generate preclinical challenge 
stocks, which we termed ZIKV-BR and ZIKV-PR, respectively. These 
ZIKV strains are part of the Asian ZIKV lineage'® and differ from each 
other by five amino acids in the polyprotein (Extended Data Fig. 2). 
The Brazil/ZKV2015 strain has also recently been reported to recapitulate 
key clinical manifestations, including fetal microcephaly and 
intrauterine growth restriction, in wild-type SJL mice¹¹. Similarly, the 
related French Polynesian H/PF/2013 strain has been shown to induce 
placental damage and fetal demise in Ifnar⁻/⁻ C57BL/6 mice as well 
as in wild-type C57BL/6 mice following IFN-α receptor blockade’. 

We designed ZIKV prM-Env immunogens based on the Brazil 
BeH815744 strain (Extended Data Fig. 2) and optimized them for 
increased antigen expression. We also designed deletion mutants 
lacking prM and/or lacking the transmembrane region (ΔTM) or the 
full stem (Δstem) of Env (Fig. 1a). Plasmid DNA vaccines encoding 
these antigens were produced, and transgene expression was verified 
by western blot (Fig. 1b). To assess the immunogenicity of these 
vaccines, groups of Balb/c mice (n = 5-10 per group) received a single 
immunization of 50 μg of each DNA vaccine by the intramuscular (i.m.) 
route at week 0. Env-specific antibody responses were evaluated at week 
3 by ELISA. The prM-Env DNA vaccine elicited higher Env-specific 
antibody titers than did the Env DNA vaccine and all of the ΔTM 
and Δstem deletion mutants (Fig. 1c), indicating the importance of 
including prM as well as the full-length Env sequence. No prM-specific 
antibody responses were detected (Extended Data Fig. 3). The prM-Env 
DNA vaccine also induced ZIKV-specific neutralizing antibodies 
after a single immunization (Table 1), as measured by a virus-specific 
microneutralization assay¹⁷. In addition, the prM-Env DNA vaccine 
induced Env-specific CD8⁺ and CD4⁺ T-lymphocyte responses, as 
assessed by IFN-γ ELISPOT and multiparameter intracellular cytokine 
staining assays (Fig. 1d, e). 

To assess the protective efficacy of these DNA vaccines against ZIKV 
challenge, we infected vaccinated or sham control Balb/c mice at week 4 
by the intravenous (i.v.) route with 10⁵ viral particles (10² plaque-forming 
units (PFU)) of ZIKV-BR or ZIKV-PR. Viral loads following ZIKV 
challenge were quantitated by RT-PCR¹⁸. Sham-vaccinated mice inoculated 
with ZIKV-BR developed approximately 6 days of detectable 
viraemia with a mean peak viral load of 5.42 log copies per ml (range 
4.55-6.57 log copies per ml; n = 10) on day 3 after challenge (Fig. 2a). 
In contrast, a single immunization with the prM-Env DNA vaccine 
provided complete protection against ZIKV-BR challenge with no 
detectable viraemia (<100 copies per ml) at any time point (n = 10). 
Complete protection was also observed when vaccinated mice were 
challenged at week 8 (data not shown). The prM-Env DNA vaccine 
also afforded complete protection against ZIKV-PR challenge (n = 5). 
ZIKV-PR replicated to slightly lower levels (mean peak viral load 
4.96 log copies per ml; range 4.80-5.33 log copies per ml; n = 5) than 
did ZIKV-BR in sham controls. In contrast with the prM-Env DNA 
vaccine, the DNA vaccines lacking prM as well as the ΔTM and Δstem 
deletion mutants did not provide complete protection against ZIKV-BR 
challenge, although viral loads were still reduced in these animals as 
compared with sham controls (Fig. 2b). 

The varying degrees of protection obtained with this set of DNA 
vaccines allowed for an analysis of immune correlates of protection. 
Protective efficacy correlated with Env-specific binding antibody titers 
(P = 0.0005 comparing protected versus infected animals; Fig. 2c) 
as well as ZIKV-specific neutralizing antibody titers >10 (Table 1). 
In addition, peak viral loads on day 3 were inversely correlated with 
antibody titers (P < 0.0001, R² = 0.72; Fig. 2d). These data suggest that 
Env-specific antibodies were critical for the protective efficacy of DNA 
vaccines against ZIKV-BR challenge. Mice that received two immunizations 
with the prM-Env DNA vaccine at week 0 and week 4 developed 
high neutralizing antibody titers of 1,022 at week 8 (Table 1) and were 
also protected against ZIKV-BR challenge (data not shown). 

The prM-Env DNA vaccine also provided complete protection 
against ZIKV-BR challenge in SJL mice (Extended Data Fig. 4) and 
against both ZIKV-BR and ZIKV-PR challenge in C57BL/6 mice 
(Extended Data Figs 5 and 6). ZIKV-BR replicated efficiently in SJL 
mice, consistent with a previous study¹¹, although at slightly lower 
levels (mean peak viral load 4.70 log copies per ml; range 3.50-5.92 log 
copies per ml; n = 5) than in Balb/c mice (Fig. 2a). In contrast, both 
ZIKV-BR and ZIKV-PR replicated poorly in C57BL/6 mice (Extended 
Data Fig. 5), also consistent with previous reports, potentially as a result 
of robust IFN-α-mediated innate immune restriction in this strain of 
mice¹⁰,¹¹,¹⁹,²⁰. 

To investigate the immunological mechanism of protection against 
ZIKV-BR challenge, we purified IgG from serum from Balb/c mice 
vaccinated with prM-Env DNA. Passive infusion of varying quantities of 
purified IgG by the i.v. route resulted in median Env-specific log serum 
antibody titers of 2.82 (high), 2.35 (mid) and 1.87 (low) in recipient 
mice following adoptive transfer (Fig. 3a). All recipient mice with log 
serum antibody titers of 2.35 or higher were protected against ZIKV-BR 
challenge (Fig. 3b, c), demonstrating that protection can be mediated by 
vaccine-elicited IgG alone and confirming that the magnitude of Env-specific 
antibody titers correlates with protective efficacy (P < 0.0001, 
Fig. 3b). In contrast, only 1 out of 5 recipient mice that received low 
levels of Env-specific IgG were protected, although they still exhibited 
reduced viral loads compared with sham controls (Extended Data 
Fig. 7). These data define the minimum threshold of Env-specific antibody 
titers required for protection in this model. 

Figure 1 | Construction and immunogenicity of DNA vaccines. a, Schema 
of ZIKV prM-Env immunogens and deletion mutants. b, Western blot of 
transgene expression from (1) prM-Env, (2) prM-Env(ΔTM), (3) prM-Env(Δstem), 
(4) Env, (5) Env(ΔTM), (6) Env(Δstem), and (7) sham DNA 
vaccines transfected in 293T cells. Balb/c mice (n = 5 per group) received a 
single immunization with 50 μg of these DNA vaccines by the i.m. route. 
c, Humoral immune responses were assessed at week 3 following 
vaccination by Env-specific ELISA. Red bars reflect medians. d, e, Cellular 
immune responses were assessed by IFN-γ ELISPOT assays (d) and 
multiparameter intracellular cytokine staining assays (e). Error bars reflect s.e.m. 

We next depleted CD4⁺ and/or CD8⁺ T lymphocytes in prM-Env-vaccinated 
mice on day −2 and day −1 before challenge (>99.9% 
efficiency; Extended Data Fig. 8). Depletion of these T-lymphocyte subsets 
did not detectably abrogate the protective efficacy of the prM-Env 
DNA vaccine against ZIKV-BR challenge (Fig. 3d). These data indicate 
that Env-specific T-lymphocyte responses were not required 
for protection in this model, although these findings do not exclude 
the possibility that ZIKV-specific cellular immune responses may be 
beneficial in other settings. 

To extend these observations to a vaccine platform that has historically 
provided clinical efficacy against other flaviviruses, we explored 
the immunogenicity and protective efficacy of a ZIKV purified inactivated 
virus (PIV) vaccine derived from the Puerto Rico PRVABC59 
strain. Groups of Balb/c mice (n = 5 per group) received a single 
immunization of 1 μg of the PIV vaccine with alum or alum alone by 
the i.m. or subcutaneous (s.c.) routes. Antibody titers were higher in 
the group that received the PIV vaccine by the i.m. route rather than 
by the s.c. route, as compared by ELISA (Fig. 4a). The PIV vaccine 
by both routes also induced ZIKV-specific neutralizing antibodies 
after a single immunization (Table 1). At week 4, all mice were i.v. 
challenged with ZIKV-BR as described above. Complete protection 
was observed in the group that received the PIV vaccine by the i.m. 
route (Fig. 4b, c). Two mice that received the PIV vaccine by the 
s.c. route showed brief low levels of viraemia (Fig. 4c), potentially 
consistent with the lower Env-specific binding antibody titers in this 
group (Fig. 4b). 

Figure 2 | Protective efficacy of DNA vaccines. a, Balb/c mice (n = 5 or 
10 per group) received a single immunization by the i.m. route with 50 μg 
prM-Env DNA vaccine or a sham vaccine and were challenged at week 4 
by the i.v. route with 10⁵ viral particles (10² PFU) ZIKV-BR or ZIKV-PR. 
Serum viral loads are shown. b, Mice (n = 5 per group) received a single 
immunization with 50 μg of various DNA vaccines and were challenged 
with ZIKV-BR. c, d, Correlates of protective efficacy (c) and day 3 viral 
loads (d) are shown. Red bars reflect medians. P values and R² values 
reflect t-tests and Spearman rank-correlation tests. 

Table 1 | ZIKV-specific neutralizing antibody titers 

Vaccine                     ZIKV MN50 titer
DNA prM-Env                 22
DNA prM-Env(ΔTM)            <10
DNA prM-Env(Δstem)          <10
DNA Env                     <10
DNA Env(ΔTM)                <10
DNA Env(Δstem)              <10
DNA prM-Env + boost         1,022
PIV + alum i.m.             15
PIV + alum s.c.             15
Sham + alum i.m.            <10
Sham + alum s.c.            <10
Anti-flavivirus antibody    232

Balb/c mice received a single immunization with 50 μg of various DNA vaccines (Figs 1, 2) or 1 μg 
purified inactivated virus (PIV) vaccines with alum (Fig. 4), and pooled serum was assessed for 
ZIKV-specific neutralizing antibodies at week 4. 50% microneutralization (MN50) titers are shown. 
Also shown are MN50 titers in serum from mice following two immunizations with DNA-prM-Env 
(boost) and an anti-flavivirus human polyclonal antibody. 

Figure 3 | Mechanistic studies. a, Env-specific serum antibody titers in 
recipient Balb/c mice (n = 5 per group) following adoptive transfer of 
varying amounts (high, mid, low) of IgG purified from serum from mice 
vaccinated with prM-Env DNA or naive mice (sham). b, Correlates of 
protective efficacy. c, Serum viral loads in mice that received adoptive 
transfer of purified IgG from vaccinated mice and were challenged with 
ZIKV-BR. d, Serum viral loads in prM-Env-DNA-vaccinated mice that 
were depleted of CD4⁺ and/or CD8⁺ T lymphocytes before challenge with 
ZIKV-BR. Red bars reflect medians. P values reflect t-tests. 

Our data demonstrate that a single immunization with a DNA 
vaccine or a PIV vaccine provided complete protection against parenteral 
ZIKV challenge in mice. The prM-Env DNA vaccine afforded 
protection in three strains of mice and against both ZIKV-BR and 
ZIKV-PR challenges, suggesting the generalizability of these observations. 
Protective efficacy was mediated by vaccine-elicited Env-specific 
antibodies, as evidenced by (1) statistical analyses of immune correlates 
of protection (Figs 2c, d, 4b), (2) adoptive transfer studies with purified 
IgG from vaccinated mice (Fig. 3a-c), and (3) T-lymphocyte depletion 
studies in vaccinated mice (Fig. 3d). The adoptive transfer studies also 
defined the threshold of Env-specific antibody titers required for protection 
in this model. 

Figure 4 | Immunogenicity and protective efficacy of the PIV vaccine. 
Balb/c mice (n = 5 per group) received a single immunization by the i.m. 
or s.c. route with 1 μg PIV vaccine with alum, or alum alone, and were 
challenged at week 4 by the i.v. route with 10⁵ viral particles (10² PFU) 
ZIKV-BR. a, Humoral immune responses were assessed at week 3 
following vaccination by Env-specific ELISA. b, Correlates of protective 
efficacy. c, Serum viral loads are shown following ZIKV-BR challenge. Red 
bars reflect medians. P values reflect t-tests. 

It is difficult to extrapolate directly the results from these vaccine 
studies in mice to potential clinical efficacy in humans. Nevertheless, 
the robust protection observed in the present studies and the clear 
immune correlates of protection suggest a path forward for ZIKV 
vaccine development in humans. Of note, similar antibody-based 
correlates of protection, including neutralizing antibody titers >10, 
have been reported for other flavivirus vaccines, including yellow 
fever virus, tick-borne encephalitis virus, and Japanese encephalitis 
virus²¹⁻²³. Moreover, the ZIKV-BR challenge isolate used in the present 
study has been shown in wild-type SJL mice to recapitulate certain 
key clinical findings of ZIKV infection in humans, including fetal 
microcephaly and intrauterine growth retardation¹¹. ZIKV-BR did 
not lead to a fatal outcome in wild-type Balb/c and SJL mice, as has 
been observed in Ifnar⁻/⁻ C57BL/6 mice!®'”°, but the magnitude 
and duration of viraemia in Balb/c and SJL mice appear comparable 
with that in humans’, suggesting the potential relevance of this 
model. It is notable that ZIKV-BR replicated efficiently in wild-type 
Balb/c and SJL mice (Fig. 2a, Extended Data Fig. 4), but replicated 
poorly in wild-type C57BL/6 mice (Extended Data Fig. 5), which is 
consistent with previous observations!™"! and indicates important 
strain-specific differences for ZIKV infectivity. Further investigation 
into the immunologic mechanisms underlying these differences 
may lead to insights into innate immune control of ZIKV. Moreover, 
further characterization of the susceptible Balb/c and SJL murine 
models may facilitate future studies of ZIKV pathogenesis and the 
development of antiviral interventions. Future studies will also need 
to address the potential relevance of cross-reactive antibodies against 
dengue virus and other flaviviruses on ZIKV vaccine immunogenicity 
and protective efficacy. 

The epidemiology of the current ZIKV outbreak!” and the clinical 
consequences for fetuses in pregnant women who become infected?* 
necessitate the urgent development of a ZIKV vaccine. Our data 
demonstrate that complete protection against ZIKV challenge was 
reliably and robustly achieved with both DNA vaccines and purified 
inactivated virus vaccines in susceptible mice. These vaccine platforms 
have previously been used at comparable doses to develop vaccines for 
other flaviviruses, including West Nile virus²⁴,²⁵, dengue viruses²⁶,²⁷, 
tick-borne encephalitis virus²⁸,²⁹, and Japanese encephalitis virus³⁰, 
and may offer safety advantages over live attenuated and replicating 
flavivirus vaccines, particularly for pregnant women. Moreover, the 
magnitude of Env-specific antibody titers that provide complete 
protection against ZIKV challenge in mice should be readily achievable 
by DNA vaccines and purified inactivated virus vaccines in humans. 
Taken together, our findings provide substantial optimism that the 
development of a safe and effective ZIKV vaccine for humans will 
probably be feasible. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS 


Animals. Balb/c, SJL, and C57BL/6 female mice at 6-8 weeks of age were purchased 
from Jackson Laboratories (Bar Harbor). Mice were vaccinated with 50 μg 
DNA vaccine in saline without adjuvant by the i.m. route or with 1 μg PIV 
vaccines with 100 μg alum (Alhydrogel; Brenntag Biosector, Denmark) adjuvant 
by the i.m. or s.c. routes in a 100 μl volume and were then challenged at week 4 by 
the i.v. route with 10⁵ viral particles (10² plaque-forming units (PFU)) ZIKV-BR 
or ZIKV-PR. Animals were randomly allocated to groups. Immunologic and virologic 
assays were performed blinded. Sample size was determined to achieve 80% 
power to detect significant differences in protective efficacy. All animal studies 
were approved by the BIDMC Institutional Animal Care and Use Committee 
(IACUC). 

DNA vaccines. ZIKV strain BeH815744 (accession number KU365780) was 
used to design transgenes, which were produced synthetically. Sequences were 
optimized for enhanced transgene expression. Pre-membrane and envelope 
(prM-Env; defined as amino acids 216-794 of the polyprotein) or Env alone 
were cloned into the mammalian expression plasmid pcDNA3.1+ (Invitrogen). 
Deletion mutants lacked the transmembrane (ΔTM) or stem (Δstem) 
regions of Env. A Kozak sequence and the Japanese encephalitis virus leader 
sequence were included”. Plasmids were produced with Macherey-Nagel 
endotoxin-free gigaprep kits. Sequences were confirmed by double-stranded 
sequencing. 

PIV vaccine. The ZIKV purified inactivated virus (PIV, also termed ZPIV) 
vaccine was produced at the Pilot Bioproduction Facility, Walter Reed Army 
Institute of Research, Silver Spring, MD, USA. The PIV vaccine was based on the 
Puerto Rican PRVABC59 isolate, which was obtained from the US Centers for 
Disease Control and Prevention, Fort Collins, CO, USA. The Vero cells used for 
passage and vaccine production were a derivative of a certified cell line manufactured 
at The Salk Institute, Swiftwater, PA. After inoculation, virus was collected on 
days 5 and 7, clarified by centrifugation and depth filter (0.45-0.2 μm), and treated 
with benzonase. The viral harvest was concentrated with an ultrafilter followed 
by purification using Captocore chromatography resin. The purified ZIKV was 
then inactivated with formalin (0.05%) at 22°C for 7 days. Following inactivation, 
formalin was removed by dialysis, and the antigen concentration was adjusted. The 
final PIV vaccine was assessed for infectivity by passage in Vero cells followed by 
plaque assays to demonstrate inactivation. 

ZIKV challenge stocks. ZIKV stocks were provided by University of São Paulo, 
Brazil (Brazil ZKV2015; ZIKV-BR¹¹) and the US Centers for Disease Control and 
Prevention, USA (Puerto Rico PRVABC59; ZIKV-PR). Both strains were passage 
number 3. Low-passage-number Vero cells were then infected at a multiplicity 
of infection (MOI) of 0.01 PFU per cell. Supernatant was screened daily for viral 
titers and collected at peak growth. Culture supernatants were clarified by centrifugation, 
fetal bovine serum was added to 20% final concentration (v/v), and the stocks were 
stored at −80°C. The concentration and infectivity of the stocks were determined 
by RT-PCR and PFU assays. The viral particle to PFU ratio of both stocks was 
approximately 1,000. 

RT-PCR. Cap genes of available ZIKV genomes were aligned using Megalign 
(DNAstar), and primers and probes to a highly conserved region were designed 
using Primer Express v3.0 (Applied Biosystems). Primers were synthesized by 
Integrated DNA Technologies (Coralville) and probes by Biosearch Technologies 
(Petaluma). To assess viral loads, RNA was extracted from serum with a QIAcube 
HT (Qiagen). Reverse transcription and RT-PCR were performed as previously 
described¹⁸. The wild-type ZIKV BeH815744 Cap gene was used as a standard 
and was cloned into pcDNA3.1+ vector, and the AmpliCap-Max T7 High Yield 
Message Maker Kit was used to transcribe RNA (Cellscript). RNA was purified 
using the RNA clean and concentrator kit (Zymo Research), and RNA quality and 
concentration were assessed by the BIDMC Molecular Core Facility. Log dilutions 
of the RNA standard were reverse-transcribed and included with each RT-PCR 
assay. Viral loads were calculated as virus particles per ml. Assay sensitivity was 
100 copies per ml. The infectivity of virus in peripheral blood from ZIKV-challenged 
mice was confirmed by PFU assays. 
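
As a rough illustration of how log dilutions of an RNA standard can be turned into copy numbers, the sketch below fits a straight line of threshold cycle (Ct) against log10 copies and inverts it for an unknown sample. The use of Ct values, the example numbers, and the function names are assumptions made for illustration; the text above does not specify how the standard curve was fitted, and scaling to copies per ml would additionally depend on extraction and reaction volumes that are not given here.

```python
def fit_standard_curve(log10_copies, ct_values):
    """Least-squares fit of ct = slope * log10(copies) + intercept."""
    n = len(log10_copies)
    mean_x = sum(log10_copies) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_copies)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_copies, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate copies per reaction."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical standards: 10^2 to 10^6 copies per reaction.
standard_log10 = [2, 3, 4, 5, 6]
standard_ct = [35.0, 31.7, 28.4, 25.1, 21.8]
slope, intercept = fit_standard_curve(standard_log10, standard_ct)
sample_copies = copies_from_ct(28.4, slope, intercept)
```

Including the standards in every run, as described above, means each plate gets its own slope and intercept rather than relying on a curve from a previous assay.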

PFU assay. Vero WHO cells were seeded in a MW6 plate to reach confluency at 
day 3. Cells were infected with log dilutions of ZIKV for 1 h and overlaid with 
agar. Cells were stained after 6 days of infection by neutral red staining. Plaques 
were counted, and titers were calculated by multiplying the number of plaques by 
the dilution and dividing by the infection volume. 
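
The titer arithmetic described here can be written out as a short sketch. The plaque count, dilution factor, and inoculum volume below are hypothetical numbers chosen only to show the calculation; the particle-to-PFU helper restates the ratio of approximately 1,000 reported for the challenge stocks.

```python
def pfu_per_ml(plaque_count, dilution_factor, infection_volume_ml):
    """Titer = plaques x reciprocal dilution / inoculum volume (ml)."""
    if infection_volume_ml <= 0:
        raise ValueError("infection volume must be positive")
    return plaque_count * dilution_factor / infection_volume_ml


def particle_to_pfu_ratio(viral_particles, pfu):
    """Ratio of physical particles (by RT-PCR) to infectious units."""
    return viral_particles / pfu


# Hypothetical well: 25 plaques at a 10^4-fold dilution, 0.5 ml inoculum.
titer = pfu_per_ml(25, 10_000, 0.5)

# Challenge dose from the text: 10^5 viral particles = 10^2 PFU.
ratio = particle_to_pfu_ratio(1e5, 1e2)
```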

Western blot. To assess transgene expression from DNA vaccines, cell lysates 
obtained 48 h following lipofectamine 2000 (Invitrogen) transient transfection 
of 293T cells were mixed with reducing sample buffer, heated for 5 min at 100°C, 
cooled on ice, and run on a precast 4-15% SDS-PAGE gel (Biorad). Protein was 
transferred to PVDF membranes using the iBlot dry blotting system (Invitrogen), 
and the membranes were blocked overnight at 4°C in PBS-T (Dulbecco’s phosphate 
buffered saline + 0.2% v/v Tween 20 + 5% w/v non-fat milk powder). Following 
overnight blocking, the membranes were incubated for 1 h with PBS-T containing 
a 1:5,000 dilution of mouse anti-ZIKV Env monoclonal antibody (BioFront 
Technologies). Membranes were then washed 3 times with PBS-T and incubated 
for 1h with PBS-T containing a 1:1,000 dilution of rabbit anti-mouse horseradish 
peroxidase (HRP) (Jackson ImmunoResearch). Membranes were then washed 
3 times with PBS-T and developed using the Amersham ECL plus western blotting 
detection system (GE Healthcare). 

ELISA. Mouse ZIKV Env ELISA kits (Alpha Diagnostic International) were used 
to determine endpoint antibody titers using a modified protocol. 96-well plates 
coated with ZIKV Env protein were first equilibrated at room temperature with 
300 μl of kit working wash buffer for 5 min. 6 μl of mouse serum was added to the 
top row, and threefold serial dilutions were tested in the remaining rows. Samples 
were incubated at room temperature for 1 h, and plates washed 4 times. 100 μl of 
anti-mouse IgG HRP-conjugate working solution was then added to each well 
and incubated for 30 min at room temperature. Plates were washed 5 times, developed 
for 15 min at room temperature with 100 μl of 3,3’,5,5’-tetramethylbenzidine 
(TMB) substrate, and stopped by the addition of 100 μl of stop solution. Plates were 
analysed at 450 nm / 550 nm on a VersaMax microplate reader using Softmax Pro 
6.0 software (Molecular Devices). ELISA endpoint titers were defined as the highest 
reciprocal serum dilution that yielded an absorbance >2-fold over background 
values. 
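
The endpoint definition above can be sketched as a small function over one threefold dilution series. The starting reciprocal dilution, the background value, and the absorbance readings are hypothetical; the text does not state the starting dilution, so it is left as a parameter.

```python
def endpoint_titer(absorbances, start_dilution, fold=3,
                   background=0.05, cutoff_fold=2.0):
    """Highest reciprocal dilution whose absorbance exceeds
    cutoff_fold x background; None if no well is positive.

    absorbances[i] is the reading at reciprocal dilution
    start_dilution * fold**i (top row first).
    """
    titer = None
    for i, value in enumerate(absorbances):
        if value > cutoff_fold * background:
            titer = start_dilution * fold ** i
    return titer


# Hypothetical series starting at 1:50 with threefold steps.
readings = [2.1, 1.4, 0.6, 0.2, 0.09, 0.06]
titer = endpoint_titer(readings, start_dilution=50)
negative = endpoint_titer([0.04, 0.03], start_dilution=50)
```

With these readings the last well above the 0.1 cutoff is the fourth one, so the endpoint titer is 50 × 3³ = 1,350, or about 3.13 on the log scale used for the titers reported in the main text.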

Neutralization assay. A high-throughput ZIKV microneutralization (MN) assay 
was developed for measuring ZIKV-specific neutralizing antibodies as a modified 
version of a qualified dengue virus microneutralization assay used in clinical 
dengue vaccine trials¹⁷. Briefly, serum samples were serially diluted threefold in 
96-well micro-plates, and 100 μl of ZIKV-PR containing 100 PFU were added to 
100 μl of each serum dilution and incubated at 35°C for 2 h. Supernatants were then 
transferred to microtiter plates containing confluent Vero cell monolayers (World 
Health Organization, NICSC-011038011038). After incubation for 4 d, cells were 
fixed with absolute ethanol: methanol for 1 h at −20°C and washed three times 
with PBS. The pan-flavivirus monoclonal antibody 6B6-C1 conjugated to HRP 
(6B6-C1 was a gift from J. T. Roehrig, CDC) was then added to each well, incubated 
at 35°C for 2 h, and washed with PBS. Plates were washed, developed with 
3,3’,5,5’-tetramethylbenzidine (TMB) substrate for 50 min at room temperature, stopped 
with 1:25 phosphoric acid, and absorbance was read at 450 nm. For a valid assay, the 
average absorbance at 450 nm of three non-infected control wells had to be < 0.5, 
and virus-only control wells had to be > 0.9. Normalized absorbance values were 
calculated, and the MN50 titer was determined by a log mid-point linear regression 
model. The MN50 titer was calculated as the reciprocal of the serum dilution that 
neutralized > 50% of ZIKV. Seropositivity was defined as a titer > 1:10. 
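
One plausible reading of the "log mid-point" calculation is sketched below: absorbances are normalized between the non-infected and virus-only controls, the two dilutions that bracket the 50% point are located, and the titer is interpolated linearly on the log10 dilution scale. This is an illustrative assumption, not the qualified assay's actual model, and the dilutions and normalized values are hypothetical.

```python
import math


def normalize(absorbance, cell_control, virus_control):
    """Fraction of the virus-only signal remaining (0 = fully neutralized)."""
    return (absorbance - cell_control) / (virus_control - cell_control)


def mn50_titer(reciprocal_dilutions, normalized):
    """Interpolate the reciprocal dilution giving 50% signal.

    Dilutions must be in ascending order. Returns None when the
    series never crosses 50% from below (titer outside the range).
    """
    pairs = list(zip(reciprocal_dilutions, normalized))
    for (d_lo, y_lo), (d_hi, y_hi) in zip(pairs, pairs[1:]):
        if y_lo < 0.5 <= y_hi:
            frac = (0.5 - y_lo) / (y_hi - y_lo)
            log_titer = math.log10(d_lo) + frac * (
                math.log10(d_hi) - math.log10(d_lo))
            return 10 ** log_titer
    return None


# Hypothetical threefold series starting at 1:10.
dilutions = [10, 30, 90, 270]
signal = [0.1, 0.3, 0.7, 0.95]
titer = mn50_titer(dilutions, signal)
below_range = mn50_titer(dilutions, [0.6, 0.8, 0.9, 1.0])
```

A series that is already above 50% signal at the first dilution returns None here, which corresponds to the "<10" entries reported for non-neutralizing sera.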
ELISPOT. ZIKV-specific cellular immune responses were assessed by IFN 
ELISPOT assays using pool of overlapping 15-amino-acid peptides covering the 
prM or Env proteins (JPT). 96-well multiscreen plates (Millipore) were coated 
overnight with 100,11 per well of 10j.g ml! anti-mouse IFN~ (BD Biosciences) 
in endotoxin-free Dulbecco’s PBS (D-PBS). The plates were then washed three 
times with D-PBS containing 0.25% Tween 20 (D-PBS-Tween), blocked for 2h 
with D-PBS containing 5% FBS at 37°C, washed three times with D-PBS-Tween, 
rinsed with RPMI 1640 containing 10% FBS to remove the Tween 20, and incubated 
with 2 µg ml⁻¹ of each peptide and 5 x 10° mouse splenocytes in triplicate in 
100 µl reaction mixture volumes. Following 18h incubation at 37 °C, the plates were 
washed nine times with PBS-Tween and once with distilled water. The plates were 
then incubated with 2 µg ml⁻¹ biotinylated anti-mouse IFN-γ (BD Biosciences) for 
2h at room temperature, washed six times with PBS-Tween, and incubated for 2h 
with a 1:500 dilution of streptavidin-alkaline phosphatase (Southern Biotechnology 
Associates). Following five washes with PBS-Tween and one with PBS, the plates 
were developed with nitroblue tetrazolium-5-bromo-4-chloro-3-indolyl-phosphate 
chromogen (Pierce), stopped by washing with tap water, air dried, and read using 
an ELISPOT reader (Cellular Technology Ltd). The numbers of spot-forming cells 
(SFC) per 10° cells were calculated. The medium background levels were typically 
<15 SFC per 10° cells. 

Intracellular cytokine staining. ZIKV-specific CD4+ and CD8+ T-lymphocyte 
responses were assessed using splenocytes and analysed by flow cytometry. Cells 
were stimulated for 1h at 37°C with 2 µg ml⁻¹ of overlapping 15-amino-acid 
peptides covering the prM or Env proteins (JPT). Following incubation, brefeldin-A 
and monensin (BioLegend) were added, and samples were incubated for 6h at 
37°C. Cells were then washed, stained, and permeabilized with Cytofix/Cytoperm (BD 
Biosciences). Data were acquired using an LSR II flow cytometer (BD Biosciences) 
and analysed using FlowJo v.9.8.3 (Treestar). Monoclonal antibodies included: 
CD4 (RM4-5), CD8α (53-6.7), CD44 (IM7), and IFN-γ (XMG1.2). Antibodies were 
purchased from BD Biosciences, eBioscience, or BioLegend. Vital dye exclusion 
(LIVE/DEAD) was purchased from Life Technologies. 

IgG purification and adoptive transfer. Serum was collected from prM-Env 
DNA-vaccinated mice or naive mice, and polyclonal IgG was purified using 
protein G purification kits (Thermo Fisher Scientific). Varying amounts of 
purified IgG were infused by the i.v. route into naive recipient mice before ZIKV 
challenge. 

CD4+ and CD8+ T-lymphocyte depletion. Anti-CD4 (GK1.5) and/or anti-CD8 
(2.43) (Bio X Cell) monoclonal antibodies were administered at doses of 500 µg per 
mouse to prM-Env DNA vaccinated mice by the i.p. route on day −2 and day −1 
before ZIKV challenge. Antibody depletions were >99.9% efficient as determined 
by flow cytometry. 

Statistical analyses. Analysis of virologic and immunologic data was performed 
using GraphPad Prism version 6.03 (GraphPad Software). Comparisons of groups 
were performed using t-tests and Wilcoxon rank-sum tests. Correlations were 
assessed by Spearman rank-correlation tests. 

Extended Data Figure 1 | ZIKV maximum likelihood phylogenetic tree. The ZIKV-BR and ZIKV-PR challenge isolates are depicted with red arrows. 
Extended Data Figure 2 | ZIKV amino acid sequence comparisons. Number of and percentage of amino acid differences in the polyprotein are shown 
for the following ZIKV isolates: Brazil/ZKV2015 (Brazil strain; ZIKV-BR challenge stock), PRVABC59 (Puerto Rico strain; ZIKV-PR challenge stock), 
BeH815744 (Brazil strain; immunogen design), H/PF/2013 (French Polynesian strain), and MR766 (African strain). 
Extended Data Figure 3 | prM-specific antibody responses in DNA-vaccinated mice. In the experiment described in Fig. 2, humoral immune responses 
were assessed at week 3 following vaccination by prM-specific ELISA. Red bars reflect medians. 
Extended Data Figure 4 | Immunogenicity and protective efficacy of prM-Env DNA vaccine in SJL mice. SJL mice (n = 5 per group) received 
a single immunization by the i.m. route with 50 µg prM-Env DNA vaccine or a sham vaccine and were challenged at week 4 by the i.v. route with 
10° viral particles (10? PFU) ZIKV-BR. Humoral immune responses were assessed at week 3 after vaccination by Env-specific ELISA (top). Red bars 
reflect medians. Serum viral loads are shown following ZIKV-BR challenge (bottom). 
Extended Data Figure 5 | Protective efficacy of prM-Env DNA vaccine in C57BL/6 mice. C57BL/6 mice (n = 5 per group) received a single 
immunization by the i.m. route with 50 µg prM-Env DNA vaccine or a sham vaccine and were challenged at week 4 by the i.v. route with 10° viral 
particles (10? PFU) ZIKV-BR or ZIKV-PR. Serum viral loads are shown following challenge. 
Extended Data Figure 6 | Protective efficacy of various DNA vaccines in C57BL/6 mice. C57BL/6 mice (n = 5 per group) received a single 
immunization by the i.m. route with 50 µg of various DNA vaccines and were challenged at week 4 by the i.v. route with 10° viral particles (10? PFU) 
ZIKV-BR. Serum viral loads are shown following challenge. 
Extended Data Figure 7 | Adoptive transfer of low titers of Env-specific IgG. Serum viral loads in mice that received adoptive transfer of low titers of 
Env-specific IgG (as defined in Fig. 3a) and were then challenged with ZIKV-BR. 
Extended Data Figure 8 | CD4+ and CD8+ T-lymphocyte depletion. CD4+ and/or CD8+ T-lymphocyte depletion following monoclonal antibody 
treatment of Balb/c mice vaccinated with prM-Env DNA. 

Pancreatic stellate cells support tumour metabolism 
through autophagic alanine secretion 


Cristovao M. Sousa!, Douglas E. Biancur!, Xiaoxu Wang", Christopher J. Halbrook*, Mara H. Sherman’, Li Zhang’, 
Daniel Kremer?, Rosa F. Hwang’, Agnes K. Witkiewicz>®, Haoqiang Ying’, John M. Asara®, Ronald M. Evans’, 


Lewis C. Cantley’, Costas A. Lyssiotis?!° & Alec C. Kimmelman!} 


Pancreatic ductal adenocarcinoma (PDAC) is an aggressive 
disease characterized by an intense fibrotic stromal response and 
deregulated metabolism!*. The role of the stroma in PDAC biology 
is complex and it has been shown to play critical roles that differ 
depending on the biological context*!°. The stromal reaction also 
impairs the vasculature, leading to a highly hypoxic, nutrient-poor 
environment*!)!”, As such, these tumours must alter how they 
capture and use nutrients to support their metabolic needs!?. 
Here we show that stroma-associated pancreatic stellate cells 
(PSCs) are critical for PDAC metabolism through the secretion 
of non-essential amino acids (NEAA). Specifically, we uncover a 
previously undescribed role for alanine, which outcompetes glucose 
and glutamine-derived carbon in PDAC to fuel the tricarboxylic 
acid (TCA) cycle, and thus NEAA and lipid biosynthesis. This shift 
in fuel source decreases the tumour’s dependence on glucose and 
serum-derived nutrients, which are limited in the pancreatic tumour 
microenvironment*!!. Moreover, we demonstrate that alanine 
secretion by PSCs is dependent on PSC autophagy, a process that 
is stimulated by cancer cells. Thus, our results demonstrate a novel 
metabolic interaction between PSCs and cancer cells, in which 
PSC-derived alanine acts as an alternative carbon source. This 
finding highlights a previously unappreciated metabolic network 
within pancreatic tumours in which diverse fuel sources are used to 
promote growth in an austere tumour microenvironment. 

We previously demonstrated that metabolism is rewired in pancreatic 
cancer cells to facilitate biosynthesis and maintain redox balance in the 
nutrient-poor conditions of a pancreatic tumour”!*», While extra- 
cellular protein can provide nutrients to the starved cancer cells'?)3, 
we hypothesized that the stroma may provide additional avenues of 
metabolic support for the tumour. Pancreatic stellate cells (PSCs) 
are a predominant cell type in the pancreatic tumour stroma and are 
important mediators of the desmoplastic response. Their abundance 
suggests that they may contribute to the metabolism of cancer cells. 
To test this idea, we assessed changes in the oxygen consumption rate 
(OCR) and extracellular media acidification rate (ECAR), measures 
of mitochondrial activity and glycolysis, respectively, in PDAC cells 
treated with conditioned medium from a well characterized human 
PSC (hPSC) line!® (Fig. 1a, b and Extended Data Fig. 1a-e). PDAC 
glycolysis showed minimal changes when cells were treated with PSC- 
conditioned medium, as measured by ECAR (Extended Data Fig. 1d, e). 
By contrast, we observed a consistent increase of 20-40% in the basal 
OCR after treatment with hPSC medium (Fig. 1a, b and Extended 
Data Fig. 1a-c), a feature that was independent of serum during the 

Figure 1 | Pancreatic stellate cells secrete metabolites that fuel 
pancreatic cancer metabolism. a, Conditioned medium (CM) from 
hPSCs increases PDAC OCR (green line), as compared to cells treated with 
PDAC CM (red line) or control (DMEM with 10% serum, black line). 
A representative trace showing change in OCR during a mitochondrial 
stress test. Error bars depict s.d. of 6 independent wells from a 
representative tracing from 6 independent experiments (depicted in b). 
b, Per cent change in basal OCR for 8988T cells treated with conditioned 
medium from different cell lines relative to 8988T cells treated with 
standard culture medium. Error bars depict s.e.m. of pooled independent 
experiments (n = 3 for primary hPSC #1, #2, primary mPSC; n=4 for 
hPSC#2, IMR90 and MiaPaCa2; n =6 for 8988T, hPSC#1). c, OCR activity 
of PSC-conditioned medium is retained after heating at 100°C for 15 min. 
Error bars, s.e.m. of independent experiments (n = 4). d, Metabolites that 
were significantly elevated in PSC-conditioned medium, decreased in 
double-conditioned medium (PSC-conditioned medium added to 8988T 
cells and then collected), and elevated intracellularly in PDAC cells treated 
with PSC-conditioned medium. Error bars, s.d. (n= 3). e, A mixture of 
NEAAs (1 mM alanine, aspartate, asparagine, glycine, glutamate, proline and 
serine) or alanine alone increases PDAC OCR. Data are normalized to cells 
treated with standard culture medium. Error bars, s.e.m. of independent 
experiments (n= 4). f, The concentration of alanine was measured in 
conditioned medium samples using liquid chromatography with tandem 
mass spectrometry (LC-MS/MS). Error bars, s.d. (n = 3). Significance 
determined with one-way ANOVA in b, c, e; t-test in d, f. Panels d, f, n = 3 
technical replicates from independently prepared samples from individual 
wells. * P< 0.05, ** P< 0.01, *** P< 0.001. The calculated P values and 
comparisons are reported in Supplementary Information. 

conditioning process (Extended Data Fig. 1f, g) and reproducible with 
multiple primary specimens (Fig. 1b and Extended Data Fig. 1h, i). 
Notably, this metabolic phenotype was specific to pancreatic cancer 
cells; non-transformed pancreatic ductal epithelial cells did not exhibit 
increased OCR in response to PSC medium (Extended Data Fig. 1j). 
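The per cent change in basal OCR reported here is a simple normalization to cells in standard culture medium. A minimal sketch follows; the function name and the OCR readings are hypothetical, chosen only to fall within the 20-40% range described.

```python
def percent_change(treated, control):
    """Per cent change of a treated measurement relative to its control."""
    return 100.0 * (treated - control) / control


# Hypothetical basal OCR readings (pmol O2 per min), illustrative only.
control_ocr = 200.0   # standard culture medium
treated_ocr = 260.0   # PSC-conditioned medium
change = percent_change(treated_ocr, control_ocr)
print(change)
```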

To identify the nature of the PSC-secreted factors that alter PDAC 
metabolism, we subjected conditioned medium to three freeze-thaw 
cycles (—80°C, 60°C) or heating (100°C, 15 min) and observed that 
it retained the ability to increase PDAC OCR (Fig. 1c and Extended 
Data Fig. 1k-m), indicating that the factor(s) lacked tertiary structure. 
Moreover, the activity was retained in medium passed through a 3-kDa 
cut-off filter (Extended Data Fig. 1n). The small size and resistance to 
extreme temperatures of the OCR-increasing factor(s) excluded large 
candidate molecules such as proteins, indicating that the factor(s) could 
be metabolites. 

We next performed a series of metabolomic studies!” to identify 
metabolites that were secreted by PSCs and taken up by PDAC cells 
(Extended Data Fig. 2a). Specifically, we sought molecules that were 
over-represented in PSC medium (and therefore secreted by PSCs); 
under-represented in the PSC medium after contact with PDAC 
cells (removed by PDAC cells); and over-represented inside PDAC 
cells treated with the PSC medium (taken up by PDAC cells). Of 
approximately 200 metabolites analysed, only the non-essential amino 
acids (NEAA) alanine and aspartate followed this pattern (Fig. 1d and 
Extended Data Fig. 2b). Treatment of PDAC cells with the individual 
amino acids revealed that only alanine had the ability to increase PDAC 
OCR to a degree comparable to that of PSC-conditioned medium 
(Fig. le and Extended Data Fig. 2c). 
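The three-way selection described above (secreted by PSCs, depleted by PDAC cells, accumulated inside PDAC cells) can be expressed as a simple filter. The sketch below is illustrative: it uses a fold-change threshold in place of the statistical tests used in the study, and the field names and abundance values are hypothetical.

```python
def cross_fed_candidates(levels, fold=1.5):
    """Return metabolites that are (1) elevated in PSC-conditioned medium,
    (2) reduced after that medium contacts PDAC cells, and (3) elevated
    inside PDAC cells treated with PSC-conditioned medium."""
    hits = []
    for name, m in levels.items():
        secreted = m["psc_cm"] > fold * m["control_medium"]
        consumed = m["double_cm"] < m["psc_cm"] / fold
        taken_up = m["cells_psc_cm"] > fold * m["cells_control"]
        if secreted and consumed and taken_up:
            hits.append(name)
    return sorted(hits)


# Hypothetical relative abundances (illustrative values only).
levels = {
    "alanine":   {"control_medium": 1.0, "psc_cm": 5.0, "double_cm": 2.0,
                  "cells_control": 1.0, "cells_psc_cm": 4.0},
    "aspartate": {"control_medium": 1.0, "psc_cm": 3.0, "double_cm": 1.5,
                  "cells_control": 1.0, "cells_psc_cm": 2.0},
    "lactate":   {"control_medium": 1.0, "psc_cm": 6.0, "double_cm": 6.5,
                  "cells_control": 1.0, "cells_psc_cm": 1.1},
    "glutamine": {"control_medium": 1.0, "psc_cm": 0.6, "double_cm": 0.4,
                  "cells_control": 1.0, "cells_psc_cm": 0.9},
}
candidates = cross_fed_candidates(levels)
print(candidates)
```

In this toy data set, lactate is secreted but not consumed and glutamine is not secreted, so neither passes all three criteria.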

To demonstrate that PSC-derived metabolites, and alanine 
specifically, were being taken up by PDAC cells, we performed 
metabolite tracing experiments in which PSCs were grown to saturation 
in medium containing uniformly carbon-13-labelled (U-13C-)glucose 
and U-13C-glutamine to label secreted NEAAs (Extended Data 
Fig. 2d). The conditioned medium from the labelled cells was then 
added to PDAC cells, allowing us to track the production and secretion 
of alanine by the PSCs and to follow its depletion from PSC medium 
upon contact with PDAC cells (Extended Data Fig. 2e). Quantification 
revealed that secreted alanine in the PSC medium reached millimolar 
concentration within 24h of conditioning (Fig. 1f and Extended Data 
Fig. 2f-i), and this occurred independent of serum in the medium 
(Extended Data Fig. 2h). In the context of a tumour in vivo, the more 
important parameters to assess are the release and uptake of alanine 
relative to other nutrients. Accordingly, we performed kinetic studies 
and found that alanine was secreted by the PSCs at the greatest rate 
of the 14 amino acids measured and more rapidly than even lactate 
(Extended Data Fig. 2j, k). Alanine was also one of only two amino 
acids to accumulate in PDAC cells, achieving an enrichment of greater 
than fivefold (Extended Data Fig. 2l). 

We next investigated how PDAC cells metabolize PSC-derived 
alanine. Alanine could increase OCR by fuelling the TCA cycle, and a 
likely route for this is through transamination of alanine into pyruvate 
(Fig. 2a). Consistently, depletion of the cytosolic or mitochondrial 
alanine transaminase (GPT1 or GPT2, respectively) in PDAC cells 
(Extended Data Fig. 3a) resulted in an increase in accumulation of 
alanine (Fig. 2b) and a decrease in OCR in PSC medium-treated 
PDAC cells (Fig. 2c and Extended Data Fig. 3b, c). Moreover, direct 
addition of pyruvate to PDAC cells increased OCR (Fig. 2d). We then 
performed U-13C-Ala tracing studies to assess how alanine was being 
used in PDAC metabolism. Treatment of cells with 1 mM U-13C-Ala 
led to a 5-10-fold increase in the intracellular alanine pool (Extended 
Data Fig. 3d-h). Carbon from alanine did not contribute to upstream 
glycolytic intermediates (Extended Data Fig. 3i, j) or alter glycolytic 
flux (Extended Data Fig. 3k) or the NAD+/NADH ratio (Extended 
Data Fig. 3l, m), and it contributed minimally to the intracellular lactate 
pool (Extended Data Fig. 3d, n), irrespective of the extracellular glucose 
concentration (Extended Data Fig. 3o-q). These results suggested 

Figure 2 | Alanine is secreted by stellate cells and is used by PDAC to 
fuel biosynthetic reactions. a, Role of the alanine transaminases (GPT1/2) 
in central carbon metabolism. Glucose (Glc), pyruvate (Pyr), lactate (Lac), 
Ac-CoA, citrate (Cit), and α-ketoglutarate (αKG). b, Knockdown of 
GPT1 or GPT2 in PDAC cells results in a further increase in intracellular 
alanine upon treatment with PSC-conditioned medium. Error bars depict 
s.d. (n= 3). Data are normalized to cells treated with standard culture 
medium for each shRNA. c, Knockdown of GPT1 or GPT2 in PDAC cells 
significantly attenuates the ability of PSC-conditioned medium to increase 
OCR. Data normalized to cells treated with standard culture medium. 
Error bars depict s.e.m. of independent experiments (n = 3). d, Alanine 
treatment of PDAC cells increases OCR in a dose-dependent fashion and 
can be recapitulated by pyruvate. Data normalized to cells treated with 
standard culture medium. Error bars depict s.d. of 4 independent wells 
from a representative experiment (of 3 experiments). e, Alanine-derived 
carbon labelling patterns of metabolites in PDAC cells treated with 
U-13C-Ala demonstrate substantial label incorporation into the TCA cycle 
metabolites citrate, isocitrate (Iso), malate (Mal) and fumarate (Fum). 
Error bars depict s.d. (n = 3). f, U-13C-alanine labelling of citrate in a panel 
of PDAC cell lines represented as fraction of citrate with labelled carbon. 
Error bars depict s.d. (n = 3). g, h, U-13C-Ala labelled PDAC cells show 
substantial incorporation of alanine into the de novo biosynthesis of the 
fatty acids palmitate (g) and stearate (h). Data presented as the sum of 
all isotopomers containing alanine-derived label. Error bars depict s.d. 
(n = 4). Significance determined with one-way ANOVA in b-d. Panels 
b, f, n = 3; panels g, h, n = 4 technical replicates from independently 
prepared samples from individual wells. * P < 0.05, ** P < 0.01, 
*** P < 0.001. The calculated P values and comparisons are reported in 
Supplementary Information. 


that alanine-derived pyruvate was being used in the mitochondria, 
because it was not affecting cytosolic glycolytic metabolism. Indeed, 
alanine was a major carbon source for the TCA cycle; 13C was markedly 
incorporated into citrate and isocitrate, and, to a lesser extent, malate 
and fumarate (Fig. 2e and Extended Data Figs 4, 5), as well as into 
the NEAAs aspartate and glutamate (Extended Data Figs 4, 5), which 
can be biosynthesized from TCA cycle intermediates in PDAC cells". 
Indeed, citrate was one of the major recipients of alanine carbon, with 
labelling ranging from 23% to 46% among PDAC lines (Fig. 2f and 
Extended Data Figs 4, 5). 

We also observed that the alanine-derived pyruvate competed 
meaningfully with mitochondrial but not cytosolic glucose-derived 
pyruvate based on the citrate (M2, where M refers to metabolite 
and 2 the number of 13C atoms present) and lactate (M3) labelling 
patterns following U-13C-Ala or U-13C-glucose tracing studies 
(Extended Data Figs 3d, i, j, o-q, 4, 5, 6a). These results illustrate 
that alanine carbon is being used selectively to fuel mitochondrial 
metabolism and are consistent with our earlier observations that the 
addition of alanine does not disrupt glycolysis (Extended Data Fig. 3i-k). 
A principal function of mitochondrial pyruvate in proliferating cells is to 
contribute to citrate generation via its conversion to acetyl coenzyme A 
(Ac-CoA). This citrate (labelled with the two Ac-CoA carbons, M2) is 
then released into the cytosol and used as a building block for fatty acid 
biosynthesis. As shown in Fig. 2g, h, more than 20% of total palmitate 
and 10% of total stearate contained alanine-derived 13C, rivalling the 
contribution derived from glucose (around 50% and 20%, respectively; 
Extended Data Fig. 6b-g), and illustrating the importance of alanine 
in lipid biosynthesis. The production of citrate from alanine for fatty 
acid biosynthesis requires an additional TCA cycle input to support 
anaplerosis. Indeed, we observed that alanine could also serve in this 
role through pyruvate carboxylase, as evidenced by M3 labelling of 
citrate and other TCA cycle intermediates from U-13C-Ala (Extended 
Data Figs 4, 5). 
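To make the M2/M3 notation concrete, the sketch below computes labelled fractions from a mass isotopologue distribution. It is illustrative only: the ion counts are hypothetical, the helper names are not from the study, and no correction for natural 13C abundance is applied.

```python
def isotopologue_fraction(mid, n):
    """Fraction of the metabolite pool carrying exactly n 13C atoms (Mn).

    mid: isotopologue abundances ordered [M0, M1, M2, ...].
    """
    return mid[n] / sum(mid)


def labelled_fraction(mid):
    """Fraction of the pool carrying at least one 13C atom (1 - M0)."""
    return sum(mid[1:]) / sum(mid)


# Hypothetical citrate ion counts, M0..M6, after U-13C-alanine tracing.
citrate = [600, 20, 250, 90, 30, 10, 0]
m2 = isotopologue_fraction(citrate, 2)   # entry via acetyl-CoA
m3 = isotopologue_fraction(citrate, 3)   # entry via pyruvate carboxylase
total_labelled = labelled_fraction(citrate)
print(m2, m3, total_labelled)
```

Summing every labelled isotopologue, as in labelled_fraction, corresponds to how the fatty acid data are presented (the sum of all species containing alanine-derived label).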

We also found that addition of alanine increased flux through the 
serine biosynthetic pathway in PDAC cells (Extended Data Fig. 6h-j). 
This pathway uses glucose to generate the NEAAs serine and glycine, 
which are used for, among other processes, the de novo biosynthesis 
of nucleic acids. U-13C-glucose tracing studies confirmed that alanine 
promoted flux of glucose-derived carbon through this pathway 
(Extended Data Fig. 6h), an effect that was more pronounced under 
glucose-limiting conditions, which resemble poorly perfused pancreatic 
tumours (Extended Data Fig. 6j). Collectively, these results demonstrated 
that alanine-derived carbon could supplant glucose-derived 
carbon in TCA cycle metabolism (to make lipids and NEAAs), thereby 
enabling glucose to be used for additional biosynthetic functions 
such as the serine biosynthetic pathway. Glutamine metabolism was 
also altered in the presence of exogenous alanine, with more alanine 
being incorporated into the TCA cycle in place of glutamine carbon 
(Extended Data Fig. 6k, l), but this occurred to a much lower degree 
relative to glucose-derived carbon in most PDAC lines. 

One possible source of PSC-secreted alanine was protein catabolism. 
Given the critical role of autophagy in PDAC!*", we tested whether 
the secretion of alanine was autophagy-dependent. PSCs in culture 
exhibit readily detectable levels of basal autophagy, as demonstrated 
by LC3 puncta (representing autophagosomes; Extended Data Fig. 7a), 
and exhibit appreciable levels of autophagic flux (Fig. 3a-c and 
Extended Data Fig. 7b-d). Notably, co-culture with PDAC cells 
(Fig. 3a, b), or treatment with PDAC-conditioned medium (Fig. 3c and 
Extended Data Fig. 7d), significantly increased autophagic flux in PSCs. 
To assess the importance of autophagy in the secretion of alanine, we 
inhibited autophagy in PSCs using short hairpin (sh)RNAs targeting 
the essential autophagy genes ATG5 and ATG7 (Extended Data 
Fig. 7b, c, e, f) and measured the ability of the PSC-conditioned medium 
to increase PDAC OCR. The ability of PSC medium to increase OCR 
in PDAC cells was abolished upon knockdown of autophagy genes 
(Fig. 3d and Extended Data Fig. 7g) and could be rescued by addition of 
exogenous alanine (Extended Data Fig. 7h). Consistently, we observed 
a drop in the levels of both intracellular alanine and secreted alanine 
when autophagy was impaired in hPSCs (Fig. 3e and Extended Data 
Fig. 7i-m). Furthermore, intracellular alanine accumulation in 
PDAC cells was also diminished following treatment with autophagy- 
impaired PSC-conditioned medium (Fig. 3f). In contrast to PDAC cells, 
which respond robustly to autophagy inhibition'®*!”, PSC proliferation 
was largely insensitive to autophagy impairment (Extended Data 
Fig. 7n). Even under nutrient-depleted conditions, PSC survival was not 
altered when autophagy was impaired (Extended Data Fig. 7o, p), indicating 
that altered PSC survival was not the cause of decreased alanine 
secretion. In sum, these data provide evidence for two-way intratumoral 
metabolic cross-talk, in which cancer cells release a signal to PSCs that 
results in the induction of autophagy and the release of alanine. 

We next sought to determine whether these PSC-induced 
metabolic alterations in PDAC would have an impact on PDAC 
proliferation. Not surprisingly, in high-nutrient medium (25 mM 
glucose, 4mM Gln, full serum), the impact of PSC medium 
on PDAC proliferation was not significant (Extended Data 
Fig. 8a). By contrast, when grown in a low-nutrient setting (serum- 
free or glucose-limiting conditions), recapitulating the austere 

Figure 3 | Alanine secretion is dependent on stellate cell autophagy. 
a, Representative images of basal autophagic flux in PSCs using an 
LC3 tandem fluorescence (GFP-RFP) reporter. PDAC cells are not 
fluorescently labelled and are indicated by asterisks. b, Autophagy in PSCs 
is increased upon co-culture with multiple PDAC lines, demonstrated 
by increased autolysosomes. Error bars, s.e.m. of n = 32 for hPSC alone; 
n = 38 for 8988T co-culture; n = 41 for Tu8902 co-culture; n = 24 for 
MiaPaCa2 co-culture pooled from 3 independent experiments. 
c, Quantification of autophagic flux in stellate cells treated with 
conditioned medium from PDAC lines. Error bars, s.e.m. of n = 65 
hPSC-conditioned medium; n = 50 8988T-conditioned medium; n = 61 
Tu8902-conditioned medium; n = 62 MiaPaCa2-conditioned medium 
in independent hPSC cells per condition pooled from 3 independent 
experiments. d, Suppression of PSC autophagy by ATG5 or ATG7 
knockdown attenuates the ability of PSC-conditioned medium to 
increase OCR in 8988T PDAC cells. Data normalized to cells treated with 
standard culture medium. Error bars show s.e.m. of pooled experiments 
(n = 6 for ATG7 #1, #2; n = 5 for shGFP, 8988T conditioned medium; 
n = 3 for ATG5 #1, #2). e, PSC-conditioned medium contains elevated 
alanine, which is significantly decreased when autophagy is inhibited. 
f, PDAC cells treated with PSC conditioned medium display elevated 
intracellular alanine, which is significantly suppressed when autophagy 
is inhibited in PSCs. Error bars for e, f, show s.d. of n = 3 technical 
replicates from independently prepared samples from individual wells. 
Significance determined with two-way ANOVA in b, c; one-way ANOVA 
in d-f. * P < 0.05, ** P < 0.01, *** P < 0.001. The calculated P values and 
comparisons are reported in Supplementary Information. 


microenvironment in vivo, there was a significant positive effect of 
PSC medium on PDAC proliferation (Fig. 4a and Extended Data 
Fig. 8b-f). Notably, PSC medium from autophagy-impaired cells 
did not produce this growth-promoting effect (Fig. 4a and Extended 
Data Fig. 8d-i). Supplementation of serum-free or low-glucose 
medium with alanine was also able to rescue growth in a manner 
similar to PSC medium (Fig. 4a and Extended Data Fig. 8d-f, j-l), 
while having a minimal effect in nutrient-rich settings (Extended Data 
Fig. 8a). Moreover, alanine was able to restore the growth-promoting 
ability of PSC medium from autophagy-impaired cells (Extended 
Data Fig. 8g). Consistent with the proposed mechanism, pyruvate (the 
product of the transamination of alanine), but not lactate, was able to 
sustain PDAC proliferation in glucose-limited medium (Extended Data 
Fig. 8j-l). Similarly, GPT1 knockdown in PDAC cells grown under 
serum-free conditions significantly attenuated the effects of PSC 
medium or alanine on proliferation (Extended Data Fig. 8m). 

To determine whether this cross-feeding mechanism was operative 
in vivo, we developed a co-injection system in which PSCs could be 
manipulated genetically and then implanted alongside cancer cells 

Figure 4 | Stellate cell metabolite secretion supports PDAC growth 
under nutrient-limiting conditions and facilitates tumour growth. 
a, PSC-conditioned medium increases proliferation of 8988T cells grown 
in serum-free medium, a feature recapitulated with exogenous alanine 
supplementation. By contrast, conditioned medium from PSCs where 
autophagy is inhibited is unable to support PDAC growth. The addition 
of 10% serum is included as a positive control. Error bars, s.e.m. (n = 7) 
independent experiments. b, c, Co-injection of 8988T PDAC cells with 
PSCs significantly enhances early tumour growth, analysed at 35 days 
post-injection (b), and decreases tumour-free survival (c), in a xenograft 
model. This effect is significantly attenuated when autophagy is suppressed 
in the PSCs. Error bars, s.e.m. (n = 10) tumours per condition per time 
point, except for the PSC-shGFP control, for which (n = 5) animals were 
injected. d, e, Co-injection of inducible KRAS mouse PDAC cells with 
mPSCs in the pancreas of syngeneic mice significantly enhances early 
tumour growth at 7 days post-injection (d) and decreases tumour-free 
survival (e) in an orthotopic syngeneic graft model, whereas this effect is 
significantly attenuated when autophagy is suppressed in the PSCs. Error 
bars, s.e.m. (n = 10) tumours per condition per time point, except for the 
mPSC-shGFP control, for which (n = 5) animals were injected. f, Model 
of tumour-stroma metabolic cross-talk. Significance determined with a 
one-way ANOVA in a, b, d; log-rank Mantel-Cox test in c, e. * P < 0.05, 
** P < 0.01, *** P < 0.001. The calculated P values and comparisons are 
reported in Supplementary Information. 


into the flanks of mice. Although previous studies have shown that 
co-injection of PSCs with PDAC cells promotes tumour growth'®, the 
contribution of metabolic cross-talk to this effect has not been explored. 
The fact that autophagy is required for PSC alanine secretion, while 
having a minimal impact on proliferation and no effect on survival 
(Extended Data Fig. 7n-p), permits selective attenuation of this 
metabolic cross-talk so that its role in tumour growth can be assessed. 
We therefore performed co-injection studies with limiting numbers 
of PDAC cells along with autophagy-competent or incompetent PSCs 
(Extended Data Fig. 9a). Consistent with the in vitro proliferation 
data, both tumour take and tumour growth kinetics were increased 
significantly when co-injected with autophagy-competent PSCs, and 
this increase was significantly attenuated when PDAC cells were co- 
injected with autophagy-incompetent PSCs (Fig. 4b, c and Extended 
Data Fig. 9b-e). Notably, the predominant effect of PSC growth support 
in these assays appears to be during tumour initiation or tumour 
take (Fig. 4b, c and Extended Data Fig. 9c, d); at later time points the 
autophagy-incompetent PSC-containing tumours approximate their 
autophagy-competent counterparts (Extended Data Fig. 9b, e). Indeed, 
the cancer cells ultimately form the majority of the grafted tumours, 
even with the wild-type PSCs (Extended Data Fig. 9f, g). Quantification 
of stellate cells by western blot for RFP (exogenous PSCs were marked 
with RFP) showed that while the levels of RFP expression varied by 
individual tumour, there were no significant differences between 
groups based on autophagy status (Extended Data Fig. 9h, i). 

Consistent with the results from the subcutaneous model, 
diminution of autophagy in PSCs slowed tumour growth and increased 
tumour-free survival in orthotopic assays (Extended Data Fig. 10a-d). 
The orthotopic tumours contained more PSCs than the subcutane- 
ous tumours, but again PSCs were less abundant than tumour cells 
(Extended Data Fig. 10e). To further validate the in vivo relevance 
of this mechanism, we used a syngeneic system consisting of cancer 
cells and PSCs derived from our murine PDAC autochthonous 
models°. Consistent with our findings using human PDAC and PSC 
lines, murine PSCs improved tumour engraftment in an autophagy- 
dependent manner (Fig. 4d, e and Extended Data Fig. 7l, m). 

Our work demonstrates growth-promoting metabolic cross-talk 
between stromal PSCs and epithelial cancer cells (Fig. 4f). Previous 
work has illustrated that intratumoral metabolic cross-talk can occur 
between different populations of cells in a tumour”®”!. For example, 
hypoxic cancer cells in colorectal and lung cancer models preferentially 
consume glucose and provide well-oxygenated cancer cells with lactate 
for oxidative metabolism*'. More recently, others have suggested that 
stromal cells can alter the metabolism of pancreatic cancer cells”” and 
that this can occur with exosomes as a vehicle for non-selective metab- 
olite delivery*®. While alterations in lactate and alanine were recently 
reported using imaging studies during pancreatic cancer progression 
in mouse models, these studies did not have the spatial resolution to 
distinguish between different cell types within the tumour and would 
not be able to detect metabolic cross-talk between cancer cells and 
fibroblasts”*. Our work, using pancreatic cancer model systems and
in vivo co-transplantation assays, demonstrates a specific role of PSC-derived
alanine, whose carbon competes with glucose-derived carbon
as a metabolic substrate—effects not recapitulated with exogenous 
lactate. 

Protein catabolized through autophagy is the source of secreted alanine 
from PSCs. Given that alanine is the second most represented amino 
acid in proteins, catabolism can provide an important source of free 
alanine*. We found that PDAC cells stimulate autophagy in PSCs and 
selectively consume the released alanine, using this carbon in the TCA 
cycle following its conversion to pyruvate. Notably, alanine-derived 
pyruvate does not equilibrate with cytosolic pyruvate. As such, it 
provides a direct mitochondrial carbon source without changing 
the cytosolic NAD+/NADH redox balance. This results in increased
mitochondrial oxygen consumption as well as lipid biosynthesis. 
Importantly, the metabolism of alanine allows traditional carbon 
sources such as glucose to be more available for other biosynthetic 
processes, such as the synthesis of serine and glycine, which are 
precursors for nucleic acid biosynthesis. This cooperative metabolism 
promotes tumour growth and is particularly relevant in nutrient- 
limited conditions. 

According to this mechanism, we would predict that PSC-mediated 
alanine secretion would be less relevant under circumstances in which 
pancreatic cancer cells are able to access nutrients through the vascu- 
lature directly, such as those in PSC-ablated pancreatic tumours®’. In 
contrast, alteration of PSC features (inhibiting autophagy, for example) 
without their ablation would allow the continued, PSC-mediated 
restraint of tumour growth while potentially sensitizing the tumour 
to treatment. A recent example of this strategy demonstrated that 
stromal reprogramming of PDAC with vitamin D sensitized them to 
chemotherapy”. 

Collectively, these findings are important for the following reasons. 
First, they shed light on the complex biology of tumour-stroma 
interactions. Furthermore, they highlight the importance of studying 
the metabolism of highly desmoplastic tumours such as PDAC in the 
proper context, in this case in the presence of relevant supporting cell 
types. Last, this cooperative metabolism between cancer cells and PSCs, 
which is dependent on autophagy, further substantiates the critical
role of metabolic scavenging in PDAC and suggests that inhibition of 
autophagy and other lysosomal degradation pathways may have an even 
greater therapeutic utility in this disease than previously appreciated. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS

Cell culture. The cell lines 8988T, MiaPaCa2, Tu8902, Panc1, MPanc96 and IMR90
were obtained from ATCC or the DSMZ. hPSCs (hPSC#1) have been previously 
described!°. hPSC#2 was isolated from an untreated human PDAC resection and 
considered de-identified ‘surgical waste’ tissue under IRB approved protocols 
03-189 and 11-104. Patients gave informed consent for tissue collection. Stromal 
cells that outgrew the cancer cells in culture were isolated by differential trypsiniza- 
tion and immortalized by infection with hTERT and SV40gp6 (Addgene plasmids 
#22396 and #10891, respectively) retro-viruses. These cells were kept in DMEM 
(Life Technologies 11965) supplemented with 10% FBS and 1% Pen/Strep (Life 
Technologies 15140). Primary human pancreatic cancer-associated fibroblasts were 
isolated from tumour resections in a similar manner as above, under IRB approved 
protocol STU 102010-051, but were not immortalized. Cells were kept in DMEM 
supplemented with 10% CCS (Thermo Scientific) and 1% Pen/Strep. PSCs were 
verified by measuring Desmin and SMA expression. HPDE have been previously 
described and grown as indicated”®. All cells were routinely tested for mycoplasma 
by PCR and PDAC lines were typically authenticated by fingerprinting as well 
as visual inspection and carefully maintained in a centralized cell bank. mPSCs 
were isolated from normal mice pancreata and purified by centrifugation using a 
Nycodenz gradient and activated by in vitro culture, as described”. 

Black 6 (B6) mPSCs were generated from B6 females (Taconic, B6NTac) 
harbouring mouse PDAC. These animals were pre-treated with doxycycline diet 
and kept in doxycycline regimen for the duration of the experiment and were 
injected with 5 x 10° iKRAS mPDAC cells! into the pancreas. Pancreatic tumours 
were resected at 2 weeks, digested in collagenase and dispase and mechanically 
minced. Cells were plated in cell culture dishes in DMEM (Gibco) with 15% 
FBS in the absence of doxycycline to limit the growth of iKRAS mouse PDAC 
cells. mPSCs were immortalized by infection with hTERT and SV40gp6 (Addgene
plasmids #22396 and #10891, respectively) retro-viruses. Cells were kept in DMEM
(Gibco) with 10% FBS and 1% Pen/Strep.

mCherry-hPSCs are hPSC#1 labelled with mCherry through infection with a 
lentivirus expressing mCherry. 

Conditioned medium was generated by adding fresh medium to cells at >50% 
confluence. Medium was harvested 48 h later and passed through 0.45-µm filters.
For size cut-off experiments conditioned medium was filtered through 3-kDa 
cutoff columns (EMD Millipore, UFC900308). Concentrated (>3 kDa) medium 
was resuspended in a DMEM volume matching the initial medium volume. Boiled 
medium experiments were performed by heating conditioned medium at 100°C 
for 15 min followed by filtration at 0.45 µm to remove precipitate. Freeze-thaw
medium was treated by 3 consecutive cycles of 15 min at −80°C followed by 15 min
at 60°C and then filtered to remove precipitate. 

The tandem fluorescence LC3-reporter stable hPSC cells were generated by 
retroviral infection of hPSCs with pBABE-puro mCherry-EGFP-LC3B (Addgene 
plasmid #22418). For autophagic flux quantification experiments, 7.5 x 10⁴
hPSC-LC3 cells were plated in 12-well plates with cover slips and 3 x 10° PDAC 
or hPSC cells were added 4h later. Cover slips were fixed in 4% paraformaldehyde 
(ThermoFisher, 28908) after cells had been in contact for 24h. Coverslips were 
mounted in DAPI containing mounting solution (Life Technologies P36931). Cells 
were imaged on a Yokogawa Spinning Disk Confocal in FITC, RFP and DAPI 
channels. The ratio of red:yellow puncta was determined by counting puncta using 
the Cell Counter ImageJ plugin.

Oil-red O staining was performed on cells plated on glass cover slips and fixed 
24h after plating in 4% paraformaldehyde (Thermo-Fisher, 28908) for 15 min. Cells 
were rinsed with PBS followed by a rinse with 60% isopropanol and stained with 
freshly prepared Oil Red O working solution comprised of 3 ml of 0.5% solution 
(Sigma, 01391) and 2 ml of H2O for 15 min, rinsed with 60% isopropanol and 
counterstained with haematoxylin. Cover slips were then washed in H2O, mounted
in Vectashield and imaged using a Leica DM2000 bright-field microscope. 
Growth assays. Growth curves were obtained as previously described!”. Cell 
growth over 48h was assessed in clear bottom 96-well plates (Costar 3603, Corning 
Incorporated) by CellTiter-Glo (Promega G7572) analysis 48h post treatment with 
conditioned medium or metabolites and determined by the mean of at least three wells 
per condition. Luminescence was measured on a POLARstar Omega plate reader. 
Metabolism experiments. OCR and ECAR experiments were performed using 
the XF-96 apparatus from Seahorse Bioscience. Cells were plated (16,000 cells 
per well for 8988T; 20,000 cells for Tu8902 or Panc-1; 50,000 cells for HPDE) 
in at least quadruplicate for each condition the day before the experiment. The 
next day, medium was completely replaced with conditioned medium (75 µl of
conditioned medium and 25 µl of fresh medium) or fresh medium containing
either 1 mM L-alanine (Sigma A7469), 1mM of NEAAs (Gibco 11140), 1 mM 
glycine (Sigma G8790), 1mM aspartate (Sigma A4534) or 1 mM cysteine (Sigma 
A9165). 20h later, medium was replaced by reconstituted DMEM with 25 mM 
glucose and 2mM glutamine (no sodium bicarbonate) adjusted to pH~7.4 and
incubated for 30 min at 37°C in a CO2-free incubator. For the mitochondrial stress
test (Seahorse 101706-100), oligomycin, FCCP and rotenone were injected to a
final concentration of 2 µM, 0.5 µM and 4 µM, respectively. For the glycolysis stress
test (Seahorse 102194-100), glucose, oligomycin and 2-deoxyglucose were injected
to a final concentration of 10 mM, 2 µM and 100 mM, respectively. OCR and ECAR
were normalized to cell number as determined by CellTiter-Glo analysis at the end 
of the experiments. 

Metabolomics. Steady-state metabolomics experiments were performed as 
previously described". Briefly, PDAC cell lines were grown to ~80% confluence 
in growth medium (DMEM, 2 mM glutamine, 10 mM glucose, 10% CCS) on 6cm 
dishes in biological triplicate. A complete medium change was performed two 
hours before metabolite collection. To trace the effect of alanine on glutamine and 
glucose metabolism, PDAC cell lines were grown as above and then transferred 
into glutamine-free (with 10 mM glucose) or glucose-free (with 2mM glutamine) 
DMEM containing 10% dialysed FBS and supplemented with either 2 mM U-13C-glutamine
(± 1 mM alanine) or 10 mM U-13C-glucose (± 1 mM alanine), respectively,
overnight. To trace alanine metabolism, PDAC cell lines were grown as above
and then transferred into DMEM (with 10mM glucose, 2mM glutamine, 10%
dialysed FBS) and supplemented with 1 mM U-13C-alanine overnight. Additionally,
fresh medium containing the labelled metabolite was exchanged 2h before metabo- 
lite extraction for steady-state analyses. To trace glucose metabolism in low-glucose 
conditions, cells were grown in 0.5 mM of glucose and medium was refreshed every 
8h for the 24h labelling period to achieve steady-state labelling. For all metabo- 
lomics experiments, the quantity of the metabolite fraction analysed was adjusted 
to the corresponding protein concentration calculated upon processing a parallel 
well in a 6-cm dish. 

To collect labelled conditioned medium, hPSC or 8988T cells were grown for 
three passages in DMEM containing 10 mM U-13C-glucose, 2mM U-13C-Gln and
10% dialysed FBS. This medium was then replaced by DMEM with unlabelled 
glucose, glutamine and 10% dialysed FBS, and incubated for 48h, filtered and 
processed for metabolite extraction. Metabolite extraction of medium was 
performed by adding 200 µl of filtered fresh conditioned medium to 800 µl of
cold (−80°C) methanol, incubated at −80°C for 30 min followed by centrifugation
at 10,000g for 10 min at 4°C. The resultant supernatant was lyophilized
by speedvac and stored at −80°C until analysis. Dried metabolite pellets were
re-suspended in 20 µl LC-MS grade water, 5 µl were injected onto a Prominence
UFLC and separated using a 4.6 mm i.d. x 100mm Amide XBridge HILIC column
at 360 µl per minute starting from 85% buffer B (100% ACN) to 0% B over 16 min.
Buffer A: 20 mM NH4OH/20 mM CH3COONH4 (pH = 9.0) in 95:5 water/ACN.
287 selected reaction monitoring (SRM) transitions were captured using positive/ 
negative polarity switching by targeted LC-MS/MS using a 5500 QTRAP hybrid 
triple quadrupole mass spectrometer. 

For kinetics of metabolite secretion by hPSCs, triplicate samples of subconfluent 
hPSCs cultured under normal conditions were changed to fresh DMEM with 10% 
dialysed FBS, which was allowed to condition for 2, 4, 8, 24, 48, or 72h. Fresh 
DMEM with 10% dialysed FBS was used as a blank control. Metabolites were then 
extracted from conditioned medium by adding ice cold 100% MeOH to a final 
concentration of 80% MeOH. 
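
The 80% MeOH target fixes how much solvent each sample needs. The following is a minimal sketch of that arithmetic, assuming volumes are simply additive (mixing contraction is ignored); the helper name is hypothetical and does not come from the paper.

```python
import math


def methanol_volume_to_add(sample_volume_ul: float, final_fraction: float = 0.80) -> float:
    """Volume of 100% MeOH (µl) to add so MeOH makes up `final_fraction` of the mix.

    Assumption: volumes are additive, so
    v_meoh / (v_meoh + v_sample) = final_fraction.
    """
    if sample_volume_ul < 0:
        raise ValueError("sample volume must be non-negative")
    if not 0 < final_fraction < 1:
        raise ValueError("final fraction must lie strictly between 0 and 1")
    return sample_volume_ul * final_fraction / (1 - final_fraction)


if __name__ == "__main__":
    # A 200 µl aliquot of conditioned medium (illustrative volume)
    print(round(methanol_volume_to_add(200.0), 1))
```

Under this assumption a 200 µl sample needs about 800 µl of methanol, the same 1:4 ratio described for the labelled conditioned medium extraction.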

For PDAC metabolite uptake kinetics, conditioned DMEM with 10% dialysed 
FBS from subconfluent hPSCs was collected after 48h of culture, and then filtered 
through a 0.45 µm filter. 8988T PDAC cells were plated in triplicate and treated with
the PSC-conditioned medium or fresh DMEM with 10% dialysed FBS for 1, 2, 4, 8, 
or 24h. The medium was removed and the cell lysate harvested with ice cold 80% 
MeOH. The soluble metabolite fractions were cleared by centrifugation, dried under 
nitrogen, then resuspended in 50:50 MeOH:H2O mixture for LC-MS analysis.

For the kinetic analyses, a Shimadzu Nexera X2 UHPLC combined with a 
Sciex 5600 Triple TOFMS was used, which was controlled by Sciex Analyst 1.7.1 
instrument acquiring software. A Supelco Ascentis Express HILIC (7.5cm x 3mm, 
2.7 µm) column was used with mobile phase (A) consisting of 5 mM NH4OAc and
0.1% formic acid; mobile phase (B) consisting of 98% ACN, 2% 5mM NH4OAc
and 0.1% formic acid. Gradient program: mobile phase (A) was held at 10% for
0.5 min and then increased to 50% in 3 min; then to 99% in 4.1 min and held for
1.4min before returning to the initial condition. The column was held at 40°C and 5 µl
of sample was injected into the LC-MS with a flow rate of 0.4 ml/min.

Calibrations of TOFMS were achieved through reference APCI source with 
average mass accuracy of less than 5 ppm except for alanine, which was 20 ppm. Key 
MS parameters were the collision energy and spread of 25 eV and 10eV for positive 
product ion acquisition and −35eV and 15eV for negative acquisition. 100 MRM
transitions were set on the MS method. Data Processing Software included Sciex 
PeakView 2.2, MasterView 1.1, LibraryView (64 bit) and MultiQuant 3.0.2. 

For analysis of palmitate and stearate, PDAC cells in log growth were labelled in 
biological quadruplicate with either 5.5mM U-13C-glucose or 1mM U-13C-Ala in
DMEM containing 2mM glutamine and 10% dialysed FBS for 3 days. Unlabelled 
species were used at equivalent concentrations, where relevant. Labelled medium
was refreshed every day. At 72h, medium was refreshed for 2h, and samples were
collected by quick rinse in ddH2O followed by liquid nitrogen quenching directly on
cells. Plates were then stored at −80°C before extraction. Polar metabolites and fatty
acids were extracted using methanol/water/chloroform, as described’. Samples
were placed on ice and 10 µl of 1.2mM D27 myristic acid as internal standard
was introduced to each cell plate. 400 µl of cold water and 400 µl of methanol were
added to each sample. Cells were collected in a centrifuge tube and 400 µl of ice-cold
chloroform was added to each tube. Extracts were vortexed at 4°C for 30 min and
centrifuged at 14,000g for 20 min at 4°C. The lower (organic) phase was recovered,
and samples were nitrogen dried before reconstitution in 50 µl of Methyl-8 reagent
(Thermo) at 60°C for 1h to generate fatty acid methyl esters (FAMEs). 

GC-MS analysis was performed using an Agilent 7890A GC equipped with 
a 30m DB-5MS+DG capillary column and a Leap CTC PAL ALS as the sample 
injector. The GC was connected to an Agilent 5975C quadrupole MS operating 
under positive electron impact ionization at 70 eV. Tunings and data acquisition were 
done with ChemStation E.02.01, PAL Loader 1.1.1, Agilent Pal Control Software 
Rev A and Pal Object Manager updated firmware. MS tuning parameters were opti- 
mized so that PFTBA tuning ion abundance ratios of 69:219:512 were 100:114:12, 
increasing high ion abundance. Agilent Fiehn retention time locking (RTL) GC
method was used and calibrated with standard FAMEs (Agilent) and confirmed 
with Agilent G1677AA Fiehn GC/MS metabolomics RTL Library. For measurement 
of FAMEs, the GC injection port was set at 250°C and GC oven temperature was 
held at 60°C for 1 min and increased to 320°C at a rate of 10°C/minute, then held 
for 10 min under constant flow with initial pressure of 10.91 psi. The MS source 
and quadrupole were held at 230°C and 159°C, respectively, and the detector was 
run in scanning mode, recording ion abundance in the range of 35-600 m/z with 
solvent delay time of 5.9 min. Data extraction was done with Agilent MassHunter 
WorkStation Software GCMS Quantitative Analysis Version B.07. Additional isotope 
correction was performed using an in-house software tool from MATLAB”**.
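
The oven programme above implies a fixed run length per injection. As a rough sketch (the function name is hypothetical, and cool-down or equilibration time between injections is not included), the total can be computed from the hold times and the linear ramp:

```python
def oven_run_time_min(start_c: float, end_c: float, ramp_c_per_min: float,
                      initial_hold_min: float, final_hold_min: float) -> float:
    """Total GC oven programme time for hold -> linear ramp -> hold."""
    if ramp_c_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    if end_c < start_c:
        raise ValueError("end temperature must not be below start temperature")
    ramp_min = (end_c - start_c) / ramp_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


if __name__ == "__main__":
    # Values quoted in the text: 60°C for 1 min, 10°C/min to 320°C, 10 min hold
    total = oven_run_time_min(60.0, 320.0, 10.0, 1.0, 10.0)
    print(total)  # 37.0 (1 min hold + 26 min ramp + 10 min hold)
```

With the 5.9 min solvent delay stated above, this would leave roughly 31 min of the programme in which ions are actually recorded.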

All 13C isotopic reagents were purchased from Cambridge Isotope Laboratories.
Quantitative PCR. Total RNA was extracted using TRIzol (Invitrogen) and reverse
transcription was performed from 2 µg of total RNA using oligo-dT and MMLV
HP reverse transcriptase (Epicentre), according to the manufacturer's instructions.
Quantitative RT-PCR was performed with SYBR Green dye using an Mx3000P
instrument (Stratagene). PCR reactions were performed in triplicate and the 
relative amount of cDNA was calculated by the comparative CT method using the 
18S ribosomal or actin RNA sequences as a control. 
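
The comparative CT calculation can be sketched as follows. This is an illustration rather than the authors' analysis script: it assumes roughly 100% amplification efficiency (a doubling per cycle), averages the technical triplicates before subtraction, and uses hypothetical function and argument names.

```python
from statistics import mean
from typing import Sequence


def relative_expression(ct_target_sample: Sequence[float],
                        ct_ref_sample: Sequence[float],
                        ct_target_control: Sequence[float],
                        ct_ref_control: Sequence[float]) -> float:
    """Fold change of a target gene by the comparative CT (2^-ddCT) method.

    Each argument holds replicate CT values (e.g. triplicate wells).
    The reference gene would be 18S or actin; the control is the
    sample that expression is expressed relative to.
    """
    delta_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    delta_control = mean(ct_target_control) - mean(ct_ref_control)
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


if __name__ == "__main__":
    # Made-up CT values for illustration only
    fold = relative_expression([22.0, 22.1, 21.9], [15.0, 15.0, 15.0],
                               [24.0, 24.0, 24.0], [15.0, 15.0, 15.0])
    print(f"fold change vs control: {fold:.2f}")
```

A target that crosses threshold two cycles earlier than in the control, after reference normalization, comes out at a fourfold increase.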

Antibodies. LC3B (Novus Biologicals NB100-2220) was used for IF at a 1:200 
dilution. Secondary anti-rabbit-GFP antibody (Invitrogen A21206) was used at 
1:200. For western blot ATG5 (Novus Biologicals NB110-53818), ATG7 (Sigma 
A2856), β-actin (Sigma A5441), LC3B (Novus Biologicals NB600-1384), RFP
(Rockland 600-401-379), and secondary HRP conjugated anti-rabbit (Thermo- 
Fisher, 31460) and anti-mouse (Thermo-Fisher 31430) antibodies were used, as 
described. For IHC analysis, SMA (Dako M0851) was used at 1:500 followed 
by anti-mouse-HRP secondary antibody (Vector labs PK6101). 

Lentiviral mRNA targets. shRNA vectors were obtained from the RNA 
Interference Screening Facility of Dana-Farber Cancer Institute. The sequences 
and/or RNAi Consortium clone IDs for each shRNA are as follows: shGFP: 
GCAAGCTGACCCTGAAGTTCAT (Addgene plasmid #30323); shATG5 #1:
TRCN0000150645 (sequence: GATTCATGGAATTGAGCCAAT); shATG5 #2:
TRCN0000150940 (sequence: GCAGAACCATACTATTTGCTT); shATG7 #1: 
TRCN0000007584 (sequence: GCCTGCTGAGGAGCTCTCCAT); shATG7 
#2: TRCN0000007587 (sequence: CCCAGCTATTGGAACACTGTA); shGPT1 
#1: TRCN0000034979 (sequence: GCAGTTCCACTCATTCAAGAA); shGPT1 
#2: TRCN0000034983 (sequence: CTCATTCAAGAAGGTGCTCAT); shGPT2 
#1: TRCN0000035024 (sequence: CGGCATTTCTACGATCCTGAA); shGPT2 
#2: TRCN0000035025 (sequence: CCATCAAATGGCTCCAGACAT). Mouse 
shATG5: TRCN0000099430 (sequence: GCCAAGTATCTGTCTATGATA); mouse
shATG7 TRCN0000092163 (sequence: CCAGCTCTGAACTCAATAATA). 
Chemicals. U-13C-labelled glucose (Cambridge Isotope Laboratories CLM-1396-10),
U-13C labelled L-glutamine (Cambridge Isotope Laboratories
CLM-1822-H-0.1), U-13C labelled L-alanine (Cambridge Isotope Laboratories
CLM-2184-H-0.1). NEAAs (Gibco 11140), D-glucose (Sigma G7528), L-glutamine
(Sigma G3126), L-alanine (Sigma A7469), glycine (Sigma G8790), L-serine (Sigma
S4311), sodium pyruvate (Sigma P5280), sodium L-lactate (Sigma L7022),
chloroquine (Waterstone technology 32152). 

Primer sequences. Sequences for qPCR primers are as follows: αSMA_Fw:
GTGTTGCCCCTGAAGAGCAT, αSMA_Rv: GCTGGGACATTGAAAGTCTCA,
Desmin_Fw: TCGGCTCTAAGGGCTCCTC, Desmin_Rv: CGTGGTCAGAAA 
CTCCTGGTT, GPT1_Fw: GTGCGGAGAGTGGAGTACG, GPT1_Rv: GATGAC 
CTCGGTGAAAGGCT, GPT2_Fw: CATGGACATTGTCGTGAACC, GPT2_Rv: 
TTACCCAGGACCGACTCCTT. 
Kits. Mitochondrial stress test (Seahorse 101706-100) and glycolysis stress test
(Seahorse 102194-100) kits were purchased from Seahorse Bioscience. NAD+/
NADH kit was purchased from Biovision (Biovision K337-100) and used according 
to the manufacturer’s instructions. 

Statistical analysis. Statistical analysis was done using GraphPad PRISM software. 
No statistical methods were used to predetermine sample size. 

When comparing multiple groups with more than one changing variable 
(for example, experiments where cells were treated with different shRNAs and 
with different conditioned media) a two-way ANOVA test was performed. For 
experiments where we analysed one variable for multiple conditions, a one-way 
ANOVA was performed. In both cases, ANOVA analyses were followed by Tukey’s 
post hoc tests to allow multiple group comparisons. Survival curve statistical 
analysis was performed using the log-rank (Mantel-Cox) test. When comparing 
two groups to each other, a Student's t-test (unpaired, 2-tailed) was performed. 
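
The choice of test described above amounts to a small decision rule. The sketch below encodes it for illustration only; the function and argument names are hypothetical, and it simplifies by treating any two-group comparison as a t-test regardless of the number of variables.

```python
def choose_test(n_groups: int, n_variables: int = 1, survival: bool = False) -> str:
    """Return the test named in the text for a given experimental design."""
    if n_groups < 2 or n_variables < 1:
        raise ValueError("need at least two groups and one variable")
    if survival:
        return "log-rank (Mantel-Cox) test"
    if n_groups == 2:
        return "unpaired two-tailed Student's t-test"
    if n_variables > 1:
        return "two-way ANOVA with Tukey's post hoc test"
    return "one-way ANOVA with Tukey's post hoc test"


if __name__ == "__main__":
    # e.g. several shRNAs crossed with several conditioned media
    print(choose_test(n_groups=6, n_variables=2))
    # e.g. one treatment variable at several levels
    print(choose_test(n_groups=4))
```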

Groups were considered significantly different when P < 0.05. The relevant 
calculated P values are reported in Supplementary Information, where detailed 
statistical information for each experiment can also be found. 

Ultrasound tumour monitoring. Tumours were identified and dimensions and 
volume were measured as previously described using high-resolution ultrasound 
(Vevo 770)’. Briefly, mice were anaesthetized using 3% isoflurane, and abdominal 
fur was removed using fine clippers and depilatory cream. Pre-warmed sterile 
saline (100-200 µl) was administered via intraperitoneal injection. Ultrasound
gel was applied over the abdominal area and the ultrasound transducer was used 
to identify abdominal landmark organs (liver/spleen) followed by the pancreas 
and the tumour. Once identified, the transducer was transferred to the 3D motor 
stage and a 3D scan was performed for measurement of tumour dimensions and 
volume. Tumour volumes were contoured as described”. 

Trypan blue-exclusion assays. To determine cell viability in starved conditions, 
cells were plated in complete medium at 50% confluency. Once the cells were 
attached, medium was replaced with serum-free DMEM. 48h later, cells were 
trypsinized, re-suspended in their own medium, diluted in trypan blue (Thermo- 
scientific 15250061) and counted using a haematocytometer. The percentage of 
dead cells was determined by trypan blue incorporation. 

Xenografts. Xenograft studies were performed as described previously“. Briefly, 
2 x 10° 8988T or MiaPaCa2 cells were either injected alone or co-injected into the 
flanks of nude female mice at 6 weeks of age (Taconic ncrnu-f) with 1 x 10° hPSCs 
previously infected with shGFP, shATG5 or shATG7 shRNAs under protocol 10-055. 
Tumour take was monitored visually and by palpation bi-weekly. Tumour diameter 
and volume were calculated based on caliper measurements of tumour length and 
height using the formula tumour volume = (length x width²)/2. Animals were
considered to have a tumour when the maximal tumour diameter was over 2mm. 
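
The caliper formula and the tumour-positive threshold can be written out directly. This is a minimal sketch with hypothetical function names, assuming both measurements are in millimetres so that the volume is in mm³:

```python
def tumour_volume_mm3(length_mm: float, width_mm: float) -> float:
    """Tumour volume from caliper measurements: (length x width^2) / 2."""
    if length_mm < 0 or width_mm < 0:
        raise ValueError("dimensions must be non-negative")
    return length_mm * width_mm ** 2 / 2


def has_tumour(max_diameter_mm: float, threshold_mm: float = 2.0) -> bool:
    """Tumour-positive when the maximal diameter is over the threshold."""
    return max_diameter_mm > threshold_mm


if __name__ == "__main__":
    # Illustrative measurements, not data from the study
    print(tumour_volume_mm3(10.0, 6.0))  # 180.0
    print(has_tumour(2.5))               # True
```

The comparison is strict, so a tumour measuring exactly 2 mm would not yet count as positive under this reading of "over 2mm".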

For syngeneic orthotopic injections, black6 female mice at 12 weeks of age 
(Taconic B6NTac), pre-conditioned with doxycycline diet and kept in doxycy- 
cline regimen for the duration of the experiment, were injected in the pancreas 
with 1 x 10° iKRAS mPDAC cells isolated from a pure black6 PDAC GEMM
(KrasG12D, P53 L/+)}° either alone or co-injected with 5 x 10° mPSCs that were 
previously infected with shGFP, shATG5 or shATG7 shRNAs (or mPSC-shGFP
was used alone as a negative control). Briefly, an incision was made on the flank, 
above the spleen. The spleen was identified and gently pulled out through the 
incision to expose the pancreas. 10 µl of cell suspension containing 20% of Matrigel
(BD-Biosciences 354234) was injected in the tail of the pancreas using a Hamilton 
syringe that was held in place for 30s to allow Matrigel polymerization. The spleen 
and pancreas were carefully re-introduced in the animal and the peritoneum 
sutured. The wound was clipped with surgical staples and the animals were allowed 
to recover for 1 week until the beginning of weekly ultrasound monitoring of
tumour take and progression. 

Human PDAC orthotopic injections were performed in a similar way, by 
injecting 5 x 10° MiaPaCa2 and/or 1 x 10° hPSC #1 infected with shGFP, shATG5
or shATG7 shRNAs into the tail of the pancreas of nude female mice (Taconic 
ncrnu-f) at 8 weeks of age. Animals were considered as tumour-positive when a 
mass detected in the pancreas reached a volume of at least 1 mm³ as calculated by
3D ultrasound. Animal studies were neither blinded nor randomized. Studies were
performed under DFCI IACUC protocol # 10-055, where the maximal tumour 
size allowed is less than 2 cm. 
Extended Data Figure 1 | Pancreatic stellate cells secrete metabolites
that PDAC utilize to fuel their metabolism. a-c, Conditioned medium
(CM) from human pancreatic stellate cell (PSC) lines (hPSC#1 and 
hPSC#2) increases oxygen consumption (OCR) in multiple PDAC cell 
lines: a, Tu8902, b, MiaPaCa2 and c, Panc-1. Data are represented as per 
cent increase in OCR in cells treated with conditioned medium versus cells 
treated with fresh DMEM containing 10% serum. Error bars represent the 
s.e.m. of n=5 for a and n=3 for b, c, except hPSC#1-conditioned medium 
in b where n= 4, from independent experiments. One-way ANOVA was 
performed. a, ***P = 0.0004 for hPSC#1 versus control, P=0.0018 for 
hPSC#2 versus control; b, ***P < 0.0001; c, *P= 0.0293. d, e, Extracellular 
acidification rate (ECAR) is not significantly altered in 8988T cells
when treated with conditioned medium from different cell lines.
A representative Seahorse trace is shown in d. Error bars show
s.d. of 6 independent wells from a representative tracing from
6 independent experiments (depicted in e). e, Results using conditioned
medium from multiple PDAC and PSC lines including primary hPSCs.
Error bars represent the s.e.m. of n=3 for MiaPaCa2, IMR-90, primary
hPSC#1 and #2, and n=6 for 8988T conditioned medium and hPSC#1,
in independent experiments. f, g, Conditioned medium from hPSC#1 and 
hPSC#2 harvested in serum-free conditions retains the capacity to increase 
OCR in 8988T (f) and Tu8902 (g) cell lines. Data are represented as per cent 
increase in OCR in cells treated with conditioned medium versus cells 
treated with fresh DMEM without serum. Error bars represent the s.e.m.
of 3 independent experiments. One-way ANOVA. f, ***P < 0.0001;
g, ***P=0.0005 for hPSC#1 versus control, P < 0.0001 for hPSC#2 versus
control. h, Characterization of PSCs (primary hPSC #1 and #2 and hPSC#1
and #2) by RT-qPCR. Primary PSCs from tumours display an activated
stellate cell signature in a similar fashion to that of the hPSC lines,
as evidenced by the high levels of expression of the activated fibroblast
marker, smooth muscle actin (αSMA), and the stellate cell marker, desmin.
mRNA levels are represented as fold change compared to 8988T cells. 
IMR90 (human fibroblasts derived from lung tissue) and human PSCs 
derived from disease-free pancreata (primary hPSC N) are included as 
controls. Note: the transfer of normal PSCs to the tissue culture setting 
also leads to activation. Expression levels are normalized to β-actin.
Error bars represent the s.d. of triplicate wells. i, Activated hPSCs are 
devoid of lipid droplets, an indicator of the activated state, as illustrated 
using oil red O staining. Tu8902 is included as a positive staining 
control. j, Conditioned medium from hPSCs does not alter OCR of non- 
transformed pancreatic ductal cells (HPDE). Error bars represent the s.d. 
of quintuplicate wells from a representative experiment (of 3 independent 
experiments). One-way ANOVA: P > 0.9. k-m, The ability of hPSC- 
conditioned medium to increase PDAC OCR is retained after boiling at 
100°C for 15 min in both Tu8902 (k) and MiaPaCa2 (l) as well as after
three consecutive freeze (−80°C, 10 min)-thaw (60°C, 10 min) cycles in
8988T, as depicted in m. Error bars represent the s.d. of 4 independent 
wells from representative experiments (of 3 experiments). One-way 
ANOVA. k, *P= 0.0004 for hPSC#1 boiled conditioned medium versus 
control, P=0.0001 for hPSC#1-conditioned medium versus control;
l, ***P = 0.0002 for hPSC boiled versus control, P < 0.0001 for hPSC
versus control; m, *P= 0.0011. n, The factor secreted by hPSCs that 
increases PDAC OCR is retained in the <3-kDa fraction of hPSC- 
conditioned medium. Error bars represent the s.e.m of 3 independent 
experiments. One-way ANOVA. *P=0.0175, ***P = 0.0007 for hPSC 
versus control, P=0.0015 for <3kDa versus control. 
Extended Data Figure 2 | Alanine is secreted by pancreatic stellate
cells and consumed by PDAC cells. a, Schematic of the metabolomic 
experiments depicted in Fig. 1d. b, The amino acids alanine, glutamate, 
proline and asparagine are differentially secreted by PSCs and consumed 
by PDAC cells. The alanine data here are also presented in Fig. 1d.
Error bars represent the s.d. of n=3 technical replicates from
independently prepared samples from individual wells. t-tests were
performed, for alanine: *P=0.0176, **P=0.0097; for glutamate,
*P= 0.0108, ***P=0.0024; for proline, *P= 0.0013 for hPSC#1 conditioned
medium versus 8988T-conditioned medium, P = 0.0024 for double- 
conditioned medium versus 8988T-conditioned medium; for asparagine, 
*P=0.0125. c, A mixture of non-essential amino acids (1 mM of NEAA: 
alanine, asparagine, aspartate, glutamate, proline and serine) increases 
MiaPaCa2 OCR in a similar fashion to PSC-conditioned medium.
Among these NEAAs, only alanine can increase PDAC OCR to an extent 
comparable to the NEAA mixture. Data are normalized to cells treated 
with fresh DMEM with 10% serum. Error bars represent the s.e.m of 3 
experiments. ***P < 0.0001. d, Profile of the U-13C-glucose and U-13C-glutamine
derived NEAA secretome of hPSCs. Error bars represent the s.d.
of n = 3 technical replicates from independently prepared samples from
individual wells. e, PSCs were labelled to saturation with U-13C-glucose
and U-13C-glutamine. Alanine was the only labelled metabolite that
showed a statistically significant increase in the PSC-conditioned medium 
(compared to 8988T-conditioned medium) and a decrease in the double 
conditioned medium (PSC-conditioned medium added to 8988T cells).
Error bars, s.d. of n= 3 technical replicates from independently prepared 
samples from individual wells. A two-tailed t-test was performed, 
***P < 0.0001. f, Alanine standard curve as determined by LC-MS/MS.
Data points for conditioned medium from hPSC (green diamond),
8988T (red triangle) and Tu8902 (red square) are displayed on the
alanine standard curve. These data are presented in Fig. 1f in µM per
10° cells. Error bars represent the s.d. of n=3 technical replicates from
independently prepared samples from individual wells. t-test,
***P < 0.0001. g, h, Alanine is secreted by PSCs in the presence (g) or
absence (h) of serum. Error bars represent the s.d. of n = 3 technical 
replicates from independently prepared samples from individual wells. 
t-test performed; g, ***P < 0.0001 for control versus hPSC#1, P= 0.0007 
for control versus hPSC#2; h, ***P = 0.0003 for control versus hPSC#1, 
P=0.0004 for control versus hPSC#2. i, The rate of PSC Ala secretion into 
conditioned medium was determined over a 72-h period using LC-MS; 
error bars represent s.d. of n = 4 technical replicates from independently 
prepared samples from individual wells. j, k, The levels of amino acids
(j) and lactate (k) in complete medium conditioned by hPSC#1 were
monitored over a 72-h period. Alanine was the most relatively secreted
metabolite and surpassed even lactate. Metabolite levels are normalized
to time 0 (fresh DMEM with 10% dialysed serum). Error bars represent the
s.d. of n=4 technical replicates from independently prepared samples
from individual wells. Two-way ANOVA was performed; ***P < 0.0001.
The same data are used for alanine and presented in curves in i-k.
l, Alanine was the most avidly consumed amino acid by 8988T cells treated
with hPSC#1-conditioned medium. Error bars represent the s.d. of n= 4 
technical replicates from independently prepared samples from individual 
wells. Two-way ANOVA was performed; ***P < 0.0001 for the last 
time-point. 
Extended Data Figure 3 | Alanine secreted by stellate cells is used by PDAC 
to fuel biosynthetic reactions. a–c, Knockdown of GPT1 or GPT2 in PDAC 
cells (a) significantly attenuates the ability of hPSC-conditioned medium 
to increase OCR in Tu8902 cells (b). This observation was repeated with 
conditioned medium from an independent hPSC line in 8988T cells (c). Error 
bars represent the s.e.m. of 3 independent experiments. One-way ANOVA; 
b, *P = 0.0134; c, *P = 0.0129. d–j, Metabolic tracing studies using U-13C-Ala 
and U-13C-pyruvate (Pyr) (f–h). d, Metabolic tracing studies using U-13C-Ala 
in 8988T cells. Error bars, s.d. of n = 3. Two-way ANOVA was performed: 
for alanine, ***P < 0.0001; for lactate, ***P = 0.0001, **P = 0.0086. 

e, Intracellular accumulation and labelling of alanine in Tu8902 cells treated 
with 1 mM alanine. Two-way ANOVA was performed: ***P < 0.0001. 

f-h, Intracellular accumulation and labelling of alanine in 8988T cells treated 
with 1 mM pyruvate or 1 mM alanine grown in media containing different 
glucose (Glc) concentrations (0.5, 10, 25mM). Error bars represent s.d. of 
n = 3. i, j, U-13C-Ala does not contribute to glycolysis or gluconeogenesis as 
seen for the metabolites glucose 6-phosphate (G6P), fructose 6-phosphate 
(F6P), fructose bis-phosphate (FBP), glyceraldehyde 3-phosphate (Ga3P), 
3-phosphoglycerate (3PG), and phosphoenolpyruvate (PEP) in 8988T 
(i) or Tu8902 (j) PDAC cell lines. Label can be incorporated in lactate 
(Lac) independent of glycolysis. Error bars represent s.d. of n = 3. k, 1 mM 
alanine does not significantly increase basal extracellular acidification 
rate (ECAR) of 8988T cells. Error bars represent s.d. of 6 replicates. Data 
presented for a representative experiment (of 3 experiments), P = 0.6082. 
l, m, Alanine does not alter the NAD+/NADH ratio in 8988T cells to the 
same extent as pyruvate in medium containing either 0.5 mM (l) or 25 mM 
(m) extracellular glucose. 10 mM pyruvate and 10 mM lactate were included 
as controls; error bars represent s.e.m. of n = 4 (l) or n = 3 (m) independent 
experiments. t-test performed; l, *P = 0.0205; m, *P = 0.0291. n–q, Alanine 
contributes minimally to lactate in both Tu8902 (n) and 8988T (o–q) cells, 
and independently of glucose concentrations in medium, as seen by tracing 
of U-13C-Ala: 0.5 mM (o), 10 mM (p) or 25 mM (q) glucose. Pyruvate labels 
~50% of the lactate pool in 8988T cells (o–q) independent of the glucose 
concentration in the medium. Error bars represent the s.d. of n = 3. 
d–j, n–o, n = 3 technical replicates from independently prepared samples 
from individual wells. 
Extended Data Figure 4 | Alanine fuels the TCA cycle in 8988T 
PDAC cells. a, 1 mM U-13C-Ala labelling of 8988T cells for 24 h shows 
incorporation of alanine carbon into the TCA cycle metabolites citrate 
(Cit), isocitrate (Iso), fumarate (Fum), malate (Mal) and NEAAs, 
aspartate and glutamate, derived thereof. These data are presented in 
Fig. 2 as fractional labelling (Fig. 2e) and as the percentage of the 
citrate pool incorporating the label (Fig. 2f). M0 refers to an unlabelled 
metabolite with no heavy carbons (unlabelled isotopomer, 12C), M1 is an 
isotopomer with one heavy (13C) carbon that can be in any position in the 
molecule, M2 is a metabolite with any two heavy carbons (13C), M3 with 
3 and so on. The maximal M for a given species represents the fully 
13C-labelled isotopomer (for example, for citrate that has a 6-carbon 
skeleton, it would be M6). In the schematic illustration, U-13C-Ala is 
represented as three red balls, each depicting a labelled carbon atom. This 
is converted into U-13C-Pyr (M3) and shuttled into the mitochondria. 
The conversion of U-13C-Pyr to Ac-CoA results in the loss of one carbon 
as CO2. Ac-CoA is then added to oxaloacetate (OAA) to form 2-13C 
labelled citrate (M2). This citrate traverses around the TCA cycle and is 
metabolized into the other TCA cycle metabolites. The carbon labelling 
patterns are indicated with red (labelled) and white (unlabelled) balls. 
αKG, α-ketoglutarate; Succ, succinate. b–e, Contribution of alanine and 
pyruvate to TCA cycle metabolites citrate (b), isocitrate (c), fumarate 
(d) and malate (e) as shown by U-13C-Ala and U-13C-Pyr tracing in 8988T 
cells in medium containing 10 mM glucose and 2 mM glutamine. f–k, 
Contribution of alanine to TCA cycle metabolites citrate (f), isocitrate 
(g), fumarate (h) and malate (i) as well as the NEAAs Glu (j) and Asp (k) is 
independent of glucose concentration in the medium, as shown by U-13C-Ala 
tracing in medium containing 0.5 mM, 10 mM or 25 mM glucose. Data 
are presented as total ion currents; error bars represent the s.d. of n = 3 
technical replicates from independently prepared samples from individual 
wells. The raw data for b–k are presented in Supplementary Information 
Fig. 2a–j. 
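The M0, M1, ... Mn notation above can be made concrete with a short calculation. The following is a minimal sketch, not the authors' analysis pipeline: the function name and the ion counts are hypothetical, no natural-abundance correction is applied, and the two summary values are one plausible reading of "percentage of the pool incorporating the label" and "fractional labelling".

```python
def summarize_isotopologues(ion_counts):
    """Summarize a mass isotopologue distribution.

    ion_counts[i] is the signal for the Mi species (i heavy carbons),
    so a 6-carbon metabolite such as citrate has 7 entries, M0..M6.
    """
    total = sum(ion_counts)
    n_carbons = len(ion_counts) - 1
    # Fraction of the pool present as each Mi species.
    fractions = [count / total for count in ion_counts]
    # Share of the pool carrying at least one heavy carbon (everything but M0).
    labelled_pool = 1.0 - fractions[0]
    # Average share of carbon positions that are 13C across the whole pool.
    carbon_enrichment = sum(i * f for i, f in enumerate(fractions)) / n_carbons
    return fractions, labelled_pool, carbon_enrichment


# Hypothetical citrate measurement (arbitrary units), M0..M6.
citrate = [60, 0, 30, 0, 10, 0, 0]
fractions, labelled_pool, carbon_enrichment = summarize_isotopologues(citrate)
print(f"labelled pool: {labelled_pool:.2f}, carbon enrichment: {carbon_enrichment:.3f}")
```

In this toy input, the M2 signal corresponds to one turn of the cycle with labelled Ac-CoA, as described in the schematic; the numbers themselves are invented for illustration.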
Extended Data Figure 5 | Alanine carbon contributes to the TCA cycle 
in multiple PDAC cell lines. a–d, 1 mM U-13C-Ala labelling of Tu8902 
(a), MiaPaCa2 (b), Panc-1 (c) and mPanc96 (d) cells shows incorporation 
of alanine carbon into the TCA cycle metabolites citrate, isocitrate, malate 
and fumarate and the NEAAs Asp and Glu. Data are presented as total 
ion currents; error bars represent the s.d. of n = 3 technical replicates 
from independently prepared samples from individual wells. These data 
are presented in Fig. 2f as the percentage of the citrate pool incorporating 
label. 
Extended Data Figure 6 | Alanine relieves the demand of PDAC cells 
on glucose and glutamine carbon so that it can fuel other biosynthetic 
processes. a, The addition of alanine to PDAC cells labelled with U-13C-glucose 
significantly increases the unlabelled citrate, with a corresponding 
reduction in the labelled (M2) citrate. A t-test was performed; 
***P = 0.0009, *P = 0.0169. b, c, Alanine is a meaningful source 
of carbon for the de novo biosynthesis of the free fatty acids palmitate 
(b) and stearate (c) in two PDAC cell lines. The sum of the isotopomers 
from these data are presented in Fig. 2g, h. d–g, The addition of alanine 
to PDAC cells labelled with U-13C-glucose reduces glucose carbon 
incorporation into palmitate in 8988T (d) and Tu8902 (e) cells as well as 
into stearate in both 8988T (f) and Tu8902 (g) PDAC cells. This is shown 
by a decrease in highly enriched species (M14, M16 for palmitate and 
M16, M18 for stearate) and an increase in less enriched species 
(M6, M8, M10 for palmitate and M6, M8, M10, M12 for stearate). 
h, U-13C-glucose tracing studies illustrate that alanine drives glucose 
carbon into the serine biosynthetic pathway, as demonstrated 
by significant increases in fully labelled (M3), glucose-derived 
3-phosphoglycerate (3PG), 3-phosphoserine (p-Ser), serine (Ser), and 
M2 glycine. A t-test was performed; *P = 0.0355 for 3PG (Ala) versus 
control; ***P = 0.0041 for p-Ser (Ala) versus control; *P = 0.0172 for Ser 
(Ala) versus control; *P = 0.0123 for Gly (Ala) versus control. 
i, j, Alanine increases serine biosynthetic pathway activity, as seen for 
changes in the precursors 3PG and p-Ser. This effect is enhanced in cells 
grown under low glucose (0.5 mM) conditions (j). A t-test was performed; 
i, *P = 0.0355, ***P = 0.0041; j, ***P = 0.0005 for 3PG Ala versus mock, 
P < 0.0001 for p-Ser Ala versus mock. k, l, Alanine alters the contribution 
of glutamine carbon to the TCA cycle as seen by changes in the U-13C-Gln-derived 
fractional labelling of the TCA metabolites citrate, isocitrate, 
fumarate and malate upon addition of 1 mM alanine in 8988T (k) and 
Tu8902 (l) PDAC cell lines. Data are presented as relative metabolite for 
a, per cent labelling for b–g, k, l, and total ion currents for h–j. Error bars 
represent the s.d. of n = 3 (a, h–l) and n = 4 (b–g) technical replicates from 
independently prepared samples from individual wells. The raw data for 
k, l are presented in Supplementary Information Fig. 2k, l. 
Extended Data Figure 7 | Stellate cell autophagy is required to support 
PDAC cell metabolism through the secretion of alanine. a, PSCs 
immunostained for LC3-II display basal autophagy in standard culture 
conditions as shown by the presence of autophagosomes represented 
by LC3 puncta (green). Nuclei are counterstained with DAPI (blue). 
b, Representative images of autophagic puncta in control (shGFP) or 
autophagy impaired (shATG5 and shATG7) PSCs using an LC3 tandem 
fluorescence (GFP-RFP) reporter. Knockdown of ATG5 or ATG7 
significantly decreases autophagosome formation. c, Quantification of 
autophagy in PSCs. Error bars represent the s.e.m. of n = 13 for shGFP; 
n = 12 for shATG5#1, shATG7#1 and #2; n = 10 for shATG5#2. Two-way 
ANOVA: **P = 0.0067 for shGFP versus shATG5#1, ***P = 0.0009 for 
shGFP versus shATG5#2, ***P < 0.0001 for shGFP versus shATG7#1 or 
#2. d, PDAC-conditioned medium increases autophagy in hPSC cells, as 
determined using an LC3 tandem fluorescence (GFP-RFP) reporter. The 
relative abundance of autolysosomes and autophagosomes (red and yellow 
puncta, respectively) is a measure of flux; data are quantified in Fig. 3c. 
e, f, Western blot demonstrating knockdown of ATG5 and ATG7 using 
two independent shRNAs in hPSC#1 (e) and hPSC#2 (f). A decrease in 
autophagy is shown by a decrease in LC3-II (lower band). g, Suppression 
of PSC autophagy by ATG5 or ATG7 knockdown attenuates the ability of 
PSC- (hPSC#2)-conditioned medium to increase PDAC OCR. Error bars 
represent the s.d. of quadruplicate wells from a representative experiment 
(of 3 experiments). One-way ANOVA; ***P = 0.0006. h, Suppression of 
PSC autophagy by ATG5 or ATG7 knockdown attenuates the ability of 
PSC- (hPSC#1)-conditioned medium to increase PDAC OCR in serum-free 
conditions and this phenotype can be rescued by addition of 1 mM 
exogenous alanine. Data are normalized to cells treated with serum-free 
DMEM. Error bars represent the s.e.m. of n = 4 for 8988T, 1 mM 
Ala, shGFP groups; n = 3 for shATG5#2, shATG7#1, shATG5#2 + Ala, 
shATG7#1 + Ala and shGFP + Ala groups in independent experiments. 
One-way ANOVA; ***P = 0.0004 for control versus 1 mM Ala, P = 0.0003 
for control versus hPSC, P < 0.0001 for control versus hPSC shATG5 #2 
+ Ala, P = 0.0002 for control versus hPSC shATG7 #1 + Ala, P < 0.0001 
for control versus hPSC + Ala. i, ATG5 or ATG7 knockdown in hPSCs 
decreases intracellular alanine concentrations compared to shGFP 
controls. Error bars, s.d. of n = 3 technical replicates from independently 
prepared samples from individual wells. One-way ANOVA: ***P = 0.0020 
for hPSC-shGFP versus hPSC-shATG5; P = 0.0040 for hPSC-shGFP 
versus hPSC-shATG7. j, In serum-free conditions, ATG5 or ATG7 
knockdown in hPSCs also decreases the secretion of alanine relative to 
shGFP controls. Error bars represent s.d. of n = 3 technical replicates from 
independently prepared samples from individual wells. t-test; 
*P = 0.0428 for hPSC versus hPSC-shATG5, P = 0.0477 for hPSC versus 
hPSC-shATG7. k–m, Autophagy inhibition by ATG5 or ATG7 knockdown 
(e, f, l) or chloroquine (CQ) treatment (10 µM) decreases alanine levels in 
conditioned medium from hPSC#2 (k) and mouse PSC (m) as compared 
to shGFP or mock-treated controls. Error bars represent s.d. of n =3 
technical replicates from independently prepared samples from individual 
wells. t-test; k, *P= 0.0213 for shGFP versus shATG7, P= 0.02061 for 
shGFP versus CQ; m, ***P < 0.0001 for shGFP versus shATG5, P= 0.0003 
for shGFP versus shATG7, P=0.0001 for shGFP versus CQ. n, Autophagy 
inhibition has a modest impact on PSC proliferation. Data are plotted as 
relative cell proliferation in arbitrary units (a.u.). Error bars, s.d. of 
4 independent wells from a representative experiment 
(of 4 experiments). One-way ANOVA; P values for the last time point 
are as follows: *P = 0.0103 for shGFP versus shATG5#1, *P = 0.0124 for 
shGFP versus shATG5#2, P = 0.6657 for shGFP versus shATG7#1, 
***P < 0.0001 for shGFP versus shATG7#2. o, p, Autophagy inhibition does 
not significantly impact hPSC (o) or mPSC (p) viability following growth 
for 48 h in serum-free conditions, as shown by a trypan-blue exclusion 
assay. Error bars represent s.e.m. of n = 3 independent experiments. 
Extended Data Figure 8 | Stellate cell metabolite secretion can 
support PDAC growth under nutrient-limiting conditions. a, hPSC-conditioned 
medium does not significantly affect proliferation of 8988T 
cells grown in complete medium (25 mM glucose, 4 mM Gln, 10% serum). 
Alanine supplementation contributes modestly to PDAC growth in 
complete medium. Error bars represent the s.d. of 6 replicate wells from 
a representative experiment (of 4 experiments). One-way ANOVA; 
**P = 0.0079. b, c, hPSC #2 or hPSC #3-conditioned medium can 
sustain in vitro proliferation over 48 h for 8988T (b) and Tu8902 (c) cells, 
as compared to PDAC-conditioned medium. Data are normalized to 
growth in serum-free DMEM and complete serum medium is included 
as a positive control. Error bars represent s.e.m. of n = 4 experiments for 
b and s.d. of quadruplicate wells from a representative experiment (of 3 
experiments) for c. One-way ANOVA; b, ***P < 0.0001; c, **P = 0.0091, 
***P < 0.0001. d, hPSC-conditioned medium facilitates proliferation 

of 8988T cells grown in low glucose (0.5 mM) medium. Conditioned 
medium from hPSCs in which autophagy is inhibited loses the ability 

to support PDAC growth. Supplementation of medium with exogenous 
alanine rescues PDAC proliferation under low glucose conditions in a 
manner akin to the positive control, 10 mM glucose-containing medium. 
Error bars represent the s.e.m of 4 independent experiments. One-way 
ANOVA; *P= 0.0420 for GFP versus control, P= 0.0158 for complete 
medium versus control; **P = 0.0054. e-g, hPSC#1-conditioned medium 
increases PDAC proliferation in MiaPaCa2 (e), Tu8902 (f) and 8988T (g) 
cells in serum-free medium. Conditioned medium from hPSCs in which 
autophagy is inhibited is less effective at maintaining PDAC proliferation. 
Alanine increases PDAC proliferation under serum-free conditions, and 
it can rescue the proliferation-inducing capacity of autophagy-deficient 
hPSC-conditioned medium (g). The addition of 10% serum is included 
as a positive control. Error bars represent the s.e.m of 8 independent 
experiments for e, f and 3 independent experiments for g. One-way 
ANOVA; e, ***P = 0.0011 for shGFP versus control, ***P = 0.0013 
for 1 mM alanine versus control, ***P = 0.0001 for complete medium 
versus control; f, ***P = 0.0007 for shGFP versus control, **P = 0.0042 
for alanine versus control, ***P < 0.0001 for complete medium versus 
control; g, ***P < 0.0001. h, i, Mouse PSC-conditioned medium increases 
proliferation in 8988T (h) and MiaPaCa2 (i) PDAC cell lines over 48h 

in serum-free conditions. This effect is impaired when the conditioned 
medium is collected from mouse PSCs in which autophagy is inhibited. 
Error bars represent s.e.m of n=3 experiments. One-way ANOVA; 

h, ***P =0.0002 for shGFP versus control, ***P < 0.0001 for complete 
medium versus control; i, *P = 0.0108, ***P = 0.0002. j–l, PDAC 
proliferation under low glucose (0.5 mM) conditions can be rescued by 
1 mM alanine or pyruvate, but not by 1 mM lactate in 8988T (j), Tu8902 
(k) and MiaPaCa2 (l) PDAC cell lines. Error bars represent the s.d. of 
quadruplicate wells from a representative experiment (of 4 experiments). 
One-way ANOVA; j, ***P = 0.0013, **P = 0.0085 for 1 mM alanine versus 
control, P = 0.0070 for 10 mM glucose versus control; k, *P = 0.0323 for 
1 mM alanine versus control, ***P = 0.0044 for 1 mM pyruvate versus 
control, ***P = 0.0010 for 10 mM glucose versus control; l, **P = 0.0060 
for 1 mM alanine versus control, ***P = 0.0008 for 1 mM pyruvate versus 
control, ***P = 0.0018 for 10 mM glucose media versus control. m, 8988T 
PDAC cells depleted for GPT1 and grown in serum-free medium do not 
proliferate in response to hPSC-conditioned medium or alanine, relative 
to control shGFP or GPT2-depleted cells. Data are represented as fold- 
change over 48h and complete medium is included as a positive control. 
Error bars represent s.e.m. from 3 independent experiments. Two-way 
ANOVA; for shGFP, ***P < 0.0001 for control versus hPSC#1, 
***P = 0.0060 for control versus alanine, ***P < 0.0001 for control 
versus complete medium; for shGPT1 #1, ***P = 0.0005; for shGPT1 #2, 
***P < 0.0001, *P = 0.0493; for shGPT2 #1, ***P < 0.0001; 
for shGPT2 #2, ***P = 0.0004 for control versus hPSC#1, ***P < 0.0001 
for others. 


© 2016 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 


[Extended Data Figure 9 image: immunoblot (a); tumour growth curves plotted against days post-injection for 8988T (b) and MiaPaCa2 (e) cells injected alone or with hPSC #1-shGFP, hPSC #1-shATG5 or hPSC #1-shATG7; trichrome-stained sections (f); sections from MiaPaCa2 + hPSC #1-mCherry-shGFP, -shATG5 and -shATG7 tumours (g).] 
Extended Data Figure 9 | Subcutaneous PDAC xenograft tumour 
growth is supported by autophagy-competent pancreatic stellate cells. 
a, Immunoblot for ATG5 and ATG7 knock-down in hPSC#1 cells infected 
before subcutaneous co-injection. LC3-II shows the inhibition 
of autophagy in these cells. b, Tumour growth is enhanced following 
co-injection of hPSCs with 8988T PDAC cells. This effect is significantly 
attenuated when autophagy is suppressed in the hPSCs during the initial 
phases of tumour growth. Error bars represent the s.e.m for 10 tumours 
per condition at each time point. t-tests were performed for each time 
point; *P < 0.05. c–e, Co-injection of MiaPaCa-2 PDAC cells with 
PSCs significantly enhances early tumour growth, analysed at 25 days 
post-injection (c) and decreases tumour-free survival (d). This effect is 
significantly attenuated when autophagy is suppressed in the PSCs. 
e, Tumour growth kinetics. Error bars represent the s.e.m. of 10 tumours 
per condition per time point, except for the PSC-shGFP control, for which 
only 5 animals were injected. c, One-way ANOVA, **P = 0.0099, 
**P = 0.0020; d, log-rank Mantel-Cox test, *P = 0.0450, **P = 0.00174, 
***P < 0.0001; e, t-tests for each time point, *P < 0.05. f, Representative 
sections for endpoint analysis from tumours for each experimental group 
stained with trichrome (top, blue) or α-smooth muscle actin (αSMA) 
(bottom, brown). Minimal residual intra-tumour collagen deposition and 
stromal content remains at endpoint. The asterisk in the αSMA staining 
images indicates a vessel and serves as a positive control. g–i, Early time-point 
analysis of MiaPaCa2 cells co-injected with RFP-labelled hPSCs 
in nude mouse flanks; tumours removed at 2 weeks post-injection. 
g, Representative sections of tumours stained with mCherry, a lineage label 
for the injected PSCs, illustrating minimal stromal content even at early 
time points. h, RFP immunoblot quantification as a marker of remaining 
RFP-labelled stellate cells injected. Four tumours originating from 
MiaPaCa2 cells co-injected with control shGFP-hPSCs, three tumours 
from MiaPaCa2 cells co-injected with shATG5-hPSCs and two tumours 
from MiaPaCa2 cells co-injected with shATG7-hPSCs were quantified. 
Error bars represent the s.d. of 2-4 lanes quantified per condition, as 
indicated. t-test; n.s., P=0.5982 for shGFP versus shATG5, P > 0.9999 

for shGFP versus shATG7. i, RFP immunoblot as a marker of remaining 
RFP-labelled stellate cells injected. Quantification shown in h. The full blot 
containing all nine samples is presented in Supplementary Fig. 1e. 
Extended Data Figure 10 | Orthotopic PDAC xenograft tumour growth 
is supported by autophagy-competent pancreatic stellate cells. 
a, Representative high-resolution ultrasound images of the pancreata of 
nude mice 4 weeks after intra-pancreatic injection with MiaPaCa2 cells 
alone or with control shGFP or autophagy-impaired shATG5 or shATG7 
hPSCs. hPSC-only injections are included as negative controls. Skin and 
spleen (Sp) are indicated as spatial references; tumours are outlined in red; 
prospective tumours that did not meet threshold at the time of imaging are 
outlined in orange. b, c, Co-injection of MiaPaCa-2 PDAC cells with PSCs 
in the pancreata of nude mice significantly enhances early tumour growth 
at 21 days post-injection (b) and decreases tumour-free survival (c) in this 
orthotopic xenograft model. Again, this effect is significantly attenuated 
when autophagy is suppressed in the PSCs. Error bars, s.e.m. of 5 tumours 
per condition per time point. b, One-way ANOVA; *P = 0.0283, 
***P = 0.0010; c, log-rank Mantel-Cox test; *P = 0.0278 for MiaPaCa2 
+ hPSC#1-shGFP versus MiaPaCa2 + hPSC#1-shATG5, *P = 0.0288, 
***P = 0.0017 for MiaPaCa2 + hPSC#1-shGFP versus MiaPaCa2. 
d, Tumour growth kinetics following co-injection of MiaPaCa2 PDAC cells 
with hPSCs show enhanced tumour growth. This effect is significantly 
attenuated when autophagy is suppressed in the hPSCs during the initial 
phases of tumour growth. Error bars represent the s.e.m. of tumours from 
5 animals measured per condition at each time point. t-tests for each time 
point; *P < 0.05. e, Representative endpoint sections of tumours 5 weeks 
post-injection stained with αSMA, a marker of PSCs, and the lineage label 
mCherry. Together, these stains illustrate that minimal stromal content 
remains at endpoint. 
Diverse activation pathways in class A GPCRs 
converge near the G-protein-coupling region 


A. J. Venkatakrishnan, Xavier Deupi, Guillaume Lebon, Franziska M. Heydenreich, Tilman Flock, Tamara Miljus, 
Santhanam Balaji, Michel Bouvier, Dmitry B. Veprintsev, Christopher G. Tate, Gebhard F. X. Schertler & M. Madan Babu 


Class A G-protein-coupled receptors (GPCRs) are a large 
family of membrane proteins that mediate a wide variety of 
physiological functions, including vision, neurotransmission 
and immune responses. They are the targets of nearly one-third 
of all prescribed medicinal drugs, such as beta blockers and 
antipsychotics. GPCR activation is facilitated by extracellular 
ligands and leads to the recruitment of intracellular G proteins. 
Structural rearrangements of residue contacts in the transmembrane 
domain serve as ‘activation pathways’ that connect the ligand- 
binding pocket to the G-protein-coupling region within the receptor. 
In order to investigate the similarities in activation pathways across 
class A GPCRs, we analysed 27 GPCRs from diverse subgroups for 
which structures of active, inactive or both states were available. 
Here we show that, despite the diversity in activation pathways 
between receptors, the pathways converge near the G-protein- 
coupling region. This convergence is mediated by a highly 
conserved structural rearrangement of residue contacts between 
transmembrane helices 3, 6 and 7 that releases G-protein-contacting 
residues. The convergence of activation pathways may explain how 
the activation steps initiated by diverse ligands enable GPCRs to 
bind a common repertoire of G proteins. 

Although GPCRs have undergone extensive sequence diversification, 
they share a conserved structural architecture with seven transmembrane 
α-helices. Binding of a ligand to the receptor from the extracellular 
side of the cell membrane influences the rearrangement of residue 
contacts between the transmembrane helices. This leads to extensive 
conformational changes on the cytoplasmic side of the membrane, 
thereby facilitating receptor activation. The activated receptor binds 
to G proteins and allosterically activates them. In this manner, the 
GPCR transmembrane region enables the transmission of signals across 
the cell membrane. At the time of writing, crystal structures have been 
determined for 27 class A GPCRs, belonging to seven of the eleven 
different subgroups (which are defined by the endogenous ligand types 
they bind; GPCRdb; http://www.gpcrdb.org). Whereas most structures 
correspond to inactive states (those bound to an antagonist or 
inverse agonist), some structures of receptors in active states (bound to 
an agonist and with structural rearrangements at the G-protein-binding 
region) have also been solved. There are five receptors for which the 
structures of both inactive and active (that is, active or active-intermediate) 
states are available: light-activated rhodopsin, the amine-activated 
β2-adrenergic receptor (β2AR), the M2 muscarinic receptor 
(M2R), the nucleoside-activated A2A receptor (A2AR) and the 
peptide-activated μ-opioid receptor (μOR). The remaining structures 
have been determined only in either their inactive or active states. The 
availability of GPCR structures from divergent subgroups (some with a 
sequence identity as low as around 20%) that are bound to chemically 
diverse ligands, and known to couple to different G proteins, allowed 
us to investigate activation pathways across class A GPCRs. 

As GPCRs are structurally similar and activate a small set of G pro- 
teins, some structural aspects of receptor activation, such as contraction 
of the ligand-binding site or opening of the cytosolic side owing to relo- 
cation of transmembrane helix 6 (TM6), are broadly similar. However, 
receptor activation is mediated by diverse ligands and therefore 
some aspects of ligand-associated GPCR activation must necessarily 
be receptor-specific. To assess the similarity of activation pathways 
across different receptors, we carried out a comprehensive comparison 
of residue contacts of inactive and active state structures. Structural 
equivalence for residues across the different GPCRs was assigned using 
the GPCR database numbering scheme from GPCRdb (further 
details in Methods). A contact between a pair of residues is said to exist 
if the interatomic distance between any two atoms across the residue 
pair is shorter than the sum of their van der Waals radii plus a cut-off 
distance (Fig. 1a; further details in Methods). Analysis of residue 
contacts in the inactive- and active-state structures of rhodopsin, 
β2AR, M2R, A2AR and μOR revealed that there are 220–266 contacts 
between residues in the seven transmembrane helices and helix 8 for 
each receptor (Fig. 1b). In every receptor, roughly half of the residue 
contacts are reorganized upon activation. 
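The contact criterion described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' implementation: the radii are commonly used Bondi values, the 0.6 Å cut-off is a placeholder (the actual value is given in the Methods, which are not reproduced here), and the function and variable names are hypothetical.

```python
from itertools import product
from math import dist

# Commonly used Bondi van der Waals radii in angstroms (assumed here).
VDW_RADII = {"C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}

# Placeholder cut-off in angstroms; an assumption, not the value from Methods.
CUTOFF = 0.6


def in_contact(residue_a, residue_b, cutoff=CUTOFF):
    """Return True if any atom pair across the two residues is closer than
    the sum of the two van der Waals radii plus the cut-off distance.

    Each residue is a list of (element, (x, y, z)) tuples.
    """
    for (el_a, xyz_a), (el_b, xyz_b) in product(residue_a, residue_b):
        threshold = VDW_RADII[el_a] + VDW_RADII[el_b] + cutoff
        if dist(xyz_a, xyz_b) < threshold:
            return True
    return False


# Toy example: two carbon atoms 3.9 angstroms apart (threshold 4.0).
res_1 = [("C", (0.0, 0.0, 0.0))]
res_2 = [("C", (3.9, 0.0, 0.0))]
print(in_contact(res_1, res_2))
```

Applying such a test to every residue pair in a structure yields the set of contacts for that structure, which is the input to the comparisons that follow.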

The combinatorial possibilities of structural comparisons of multiple 
receptors make it difficult to determine the similarities in contacts that 
change upon activation among different receptors. Therefore, we used 
an unbiased, multi-way comparison approach in which we analysed 
the pattern of contacts between structurally equivalent residues across 
all inactive and active state structures (termed the contact fingerprint; 
Fig. 2a). Of the 451 contact fingerprints, only 30 represent contacts that 
are maintained consistently in all inactive- and/or active-state structures. 
The remaining 421 contact fingerprints represent contacts that are not 
maintained consistently in the inactive- or active-state structures across 
the five GPCRs. This suggests that there is marked diversity in the 
activation pathways of the different receptors (Fig. 2b). 
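To make the idea of a contact fingerprint concrete, the sketch below records, for one contact, whether it is present in each structure and then classifies the resulting pattern. It is a toy illustration, not the authors' code: the receptor names, the contact sets and the category labels are hypothetical, and only the 3x46–6x37, 3x46–7x53 and 7x53–8x50 pairs are taken from the text.

```python
def pair(a, b):
    """An unordered residue pair, labelled by GPCRdb-style positions."""
    return frozenset((a, b))


# (receptor, state, set of contacts). Toy data for illustration only.
STRUCTURES = [
    ("receptor A", "inactive", {pair("3x46", "6x37"), pair("7x53", "8x50"), pair("1x50", "2x50")}),
    ("receptor A", "active", {pair("3x46", "7x53"), pair("1x50", "2x50")}),
    ("receptor B", "inactive", {pair("3x46", "6x37"), pair("7x53", "8x50"),
                                pair("1x50", "2x50"), pair("5x50", "6x44")}),
    ("receptor B", "active", {pair("3x46", "7x53"), pair("1x50", "2x50")}),
]


def fingerprint(contact, structures):
    """Presence (True) or absence (False) of a contact in every structure."""
    return tuple(contact in contacts for _, _, contacts in structures)


def classify(contact, structures):
    """Label a contact by how consistently it appears in each state."""
    inactive = [contact in c for _, state, c in structures if state == "inactive"]
    active = [contact in c for _, state, c in structures if state == "active"]
    if all(inactive) and all(active):
        return "maintained in both states"
    if all(inactive) and not any(active):
        return "inactive-specific"
    if all(active) and not any(inactive):
        return "active-specific"
    return "variable"


all_contacts = set().union(*(c for _, _, c in STRUCTURES))
for contact in sorted(all_contacts, key=sorted):
    print(sorted(contact), fingerprint(contact, STRUCTURES), classify(contact, STRUCTURES))
```

Contacts labelled "variable" in this scheme correspond to fingerprints that are not maintained consistently across receptors, which is the category the text reports as the large majority.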

Despite the diversity in the reorganization of residue contacts upon 
receptor activation, four contacts involving seven residues are exclu- 
sively maintained in all inactive-state structures and two contacts 
involving four residues are maintained exclusively in all active-state 
structures (Fig. 3 and Supplementary Table 1). When the roles of 
these contacts in the inactive- and active-state structures are consid- 
ered together, a common re-organization of contacts upon activation 
becomes apparent, involving residue positions 3x46 in TM3, 6x37 in 
TM6 and 7x53 in TM7 (notation represents the GPCRdb numbering 
scheme) (Fig. 3). These contacts are proximal to the G protein- 
coupling region and therefore distant from the ligand-binding pocket. 
This finding highlights that the diverse structural changes among the 
receptors, which are stabilized by a range of different ligands, converge 
on a common set of rearrangements near the G-protein-coupling 
region. 

Figure 1 | Comparison of residue contacts in inactive- and active-state 
structures of class A GPCRs. a, Residue contact in a GPCR. Residues 
are denoted as circles and the non-covalent contacts between residues 
(residue contact) are denoted as lines connecting the circles. b, Similarity 

Upon receptor activation, the cytoplasmic side of TM6 moves away 
from the rest of the transmembrane bundle, making several previously 
inaccessible G-protein-coupling residues accessible. Many of these 
residues participate in triggering the conserved allosteric mechanism 
for GDP release in G proteins. These include position 6x37 in the 
cytoplasmic end of TM6, which, upon activation, contacts a universally 
conserved leucine (position G.H5.25; notation taken from the 
Common G protein numbering scheme) of the G protein in the 
β2AR–Gs structure (Fig. 4a). In its inactive state, the residue at 6x37 is 
engaged in a residue contact with a conserved hydrophobic residue at 
position 3x46 (Fig. 4a). Upon activation, the residue at 3x46 breaks the 
contact with the one at 6x37 and forms a new contact with a tyrosine 
residue, Tyr7x53, within the highly conserved NPXXY motif of TM7. In 
the inactive state, Tyr7x53 is not available to engage with 3x46 because 
TM7 and TM3 are far apart and require TM6 to move out in order to 
form a contact. The residues at position 8x50 and 1x53 contact the 
residue at 7x53 in the inactive state, and both contacts are lost upon 
receptor activation. This contact rearrangement is observed consist- 
ently in each of the five GPCRs examined, even though they are from 
different GPCR subgroups, bind to different ligands and couple to 
different G proteins. Thus, despite the diversity in contact reorganization 
upon activation (Fig. 2b), there exists a common rearrangement of 
residue contacts near the G-protein-coupling site that underlies the 
activation of diverse class A GPCRs (Fig. 4a). 
Figure 2 | Patterns of residue contacts across inactive and active
state structures of GPCRs. a, For every residue contact, the presence
(or absence) of a contact between structurally equivalent residues is
computed for all structures. The pattern of presence and absence (filled
Comparisons of the number of residue contacts between inactive and
active states are shown for five class A GPCRs using Venn diagrams.


To assess the consistency of the contacts between residues 3x46, 6x37 
and 7x53, we investigated all the other class A GPCR structures that 
were available only in either the inactive or active state (further details
in Methods). These include functionally diverse receptors belonging to 
seven of the eleven subgroups of class A GPCRs: light-activated opsins, 
aminergic receptors, peptide-binding receptors, protein-binding recep- 
tors, nucleoside-binding receptors, lipid-binding receptors, and others 
(Fig. 4b). The contact between 3x46 and 6x37 is present consistently 
in all inactive state structures but in none of the active state structures 
(Fig. 4b). Similarly, the contact between 3x46 and 7x53 is present con- 
sistently in all active-state structures but in none of the inactive-state 
structures (Fig. 4b). To investigate the importance of these residue 
positions and contacts for class A GPCRs (including in those that lack 
a solved structure), we performed a sequence analysis of all human 
non-olfactory class A GPCRs. We found that the residues that medi- 
ate contacts tend to be predominantly large hydrophobic or aromatic 
residues and therefore are likely to fulfil the distance criterion for 
contact formation (Extended Data Fig. 1). Together, these observations 
suggest that the rearrangement of contacts involving positions 3x46, 
6x37 and 7x53 is likely to be conserved and important for receptor 
activation in all class A GPCRs. 

In order to investigate experimentally the importance of the
interactions between 3x46, 6x37 and 7x53 in a receptor for which there
is no crystal structure available, we examined the signalling profile
of the V2 vasopressin receptor using BRET signalling assays. We
created single-, double- and triple-alanine mutants at these positions
and measured vasopressin-induced Gs and Gq activation to assess
Figure 3 | Conserved rearrangement of residue contacts between
inactive and active state structures of GPCRs. Considering the conserved 
residue contacts in the inactive (left) and the active (right) states together 
reveals a conserved rearrangement of residue contacts in all the five 
receptors. The cartoons show schematic representations of the contacts at 
the secondary structure level (inter-helical) and the residue level 


receptor function. The mutations resulted in markedly reduced ability
of the receptor to activate Gs and Gq when compared to the wild-type
receptor (Extended Data Fig. 2). Previous observations from 
mutagenesis experiments in other GPCRs also highlight the impor- 
tance of these positions for signalling (Supplementary Table 2). For 
instance, mutation of 6x37 in α1B-AR results in a reduction of more
than 70% in downstream signalling. Similarly, mutation of Tyr7x53
in β2AR leads to a significant reduction in G protein activation. The
nature of the substitution is likely to influence how the receptor is
affected. Given the key role of these residues, a mutation in one of
these positions is likely to disrupt functional integrity, which could
lead to a pathological condition in a physiological context. Indeed,
mutations at these positions in different GPCRs are associated with
disease (an example being hypogonadotropic hypogonadism; more
can be found in Supplementary Table 3). Given the importance of 
positions 3x46, 6x37 and 7x53 in diverse GPCRs, we anticipate that 
knowledge about the conserved rearrangement of contacts will guide
GPCR modelling and simulations, facilitate engineering of GPCRs 
and help researchers to identify disease-causing mutations in other 
receptors. 

Several studies have provided insights into distinct mechanisms of
activation in individual receptors, and have described networks of
contacts that are present in various combinations in different receptors
upon activation. Here we report a highly conserved rearrangement
of specific residue contacts that functions as a common step in the
activation pathways of diverse class A GPCRs. Because the microenvi- 
ronment (that is, the surrounding residues and second shell residues) in 
which this rearrangement takes place diverges among receptor families, 
the detailed mechanism by which this common step is facilitated is 
likely to be distinct for different sets of GPCRs. Despite such differ- 
ences, we find that the activation pathways ultimately converge onto a 
conserved, specific set of contact rearrangements between topologically 
Figure 4 | Conserved rearrangement of residue contacts between
inactive- and active-state structures across diverse class A GPCRs.
a, Illustration of the conserved rearrangement of residue contacts between
inactive- and active-state structures in β2AR. Dotted circles around
6x37 and 7x53 denote the movement of TM6 and TM7 upon activation.
b, Conservation pattern of residue contacts involved in the contact 
rearrangement upon receptor activation in the inactive- (left) and active- 
state (right) structures of class A GPCRs. 
equivalent residues near the G-protein-coupling region. Future studies
aimed at investigating residues at and around these positions will help 
to uncover the steps that lead to the triggering of this convergent reor- 
ganization in a receptor- and ligand-specific manner. 

From an evolutionary perspective, uncoupling the structural changes 
near the ligand-binding pocket from the G-protein-coupling region 
permits the ligand-binding pocket to evolve independently of the 
intracellular region that couples to G proteins, whilst ensuring that the 
activation pathways converge near the G-protein-coupling site. In this 
manner, the convergence of ligand-mediated allosteric activation path- 
ways may have contributed to the evolutionary success of GPCRs: their 
ability to bind and be activated by a plethora of ligands, but to signal 
intracellularly via a small repertoire of signalling proteins inside the cell. 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
METHODS


No statistical methods were used to predetermine sample size. The experiments 
were not randomized. The investigators were not blinded to allocation during 
experiments and outcome assessment. 

Data set. The coordinates of the crystal structures of 27 GPCRs were obtained from
the Protein Data Bank (PDB; http://www.rcsb.org). The structures were classified
into either inactive or active (active/active-intermediate) states. An
inactive-state structure is one that is bound to an antagonist or an inverse agonist. An
active-state structure is one bound to an agonist and has undergone structural 
rearrangements in the G-protein-coupling region in comparison to its inactive 
state structure. 

There are inactive-state structures for the following GPCRs: rhodopsin (PDB
accession code 1GZM), invertebrate rhodopsin (2Z73), β1-adrenergic receptor
(2VT4), β2-adrenergic receptor (2RH1), D3 dopamine receptor (3PBL), H1
histamine receptor (3RZE), M2 muscarinic receptor (3UON), M3 muscarinic
receptor (4DAJ), CXCR4 chemokine receptor (3ODU), CCR5 chemokine receptor
(4MBS), κ-opioid receptor (4DJH), μ-opioid receptor (4DKL), nociceptin receptor
(4EA3), δ-opioid receptor (4EJ4), orexin 2 receptor (4S0V), angiotensin II type
1 receptor (4YAY), protease activated receptor 1 (3VW7), sphingosine 1P
receptor (3V2Y), lysophosphatidic acid receptor 1 (4Z36), adenosine A2A receptor
(3EML), and P2Y1 receptor (4XNV).

There are active-state structures for the following GPCRs: rhodopsin bound to
a peptide resembling the C terminus of transducin (PDB accession code 3PQR)
or bound to arrestin (4ZWJ), adenosine A2A receptor (2YDV), β2-adrenergic
receptor bound to the G protein Gs (3SN6), M2 muscarinic receptor bound to
a G protein mimetic nanobody (4MQS), μ-opioid receptor bound to a G protein
mimetic nanobody (5C1M), and viral US28 bound to a G protein mimetic
nanobody (4XT1).

There are agonist-bound structures of four GPCRs: β1-adrenergic receptor
(PDB accession code 2Y01), serotonin 1B receptor (4IAQ), serotonin 2B receptor
(4IB4) and neurotensin 1 receptor (4GRV). The β1-adrenergic receptor structure
(2Y01) is nearly identical to its inactive-state structure in the G-protein-coupling
region and is in an agonist-bound-inactive conformation. Since there are no
antagonist- or inverse-agonist-bound structures of serotonin 1B receptor,
serotonin 2B receptor and neurotensin 1 receptor, their agonist-bound structures
(4IAQ, 4IB4 and 4GRV) cannot be reliably classified as in an inactive state or an
active state. As a result these structures were excluded.

There are three structures of the P2Y12 receptor (PDB accession codes 4NTJ,
4PXZ and 4PY0). 4PXZ and 4PY0 are similar to each other, and have a distorted 
TM6 (in the cytoplasmic side) owing to the fusion with BRIL (the helix ‘ends’ 
right after 6x36, becoming a short unstructured region that links to BRIL). 4NTJ 
is also unusual as the fusion protein still distorts the cytoplasmic region near 6x36 
but the ligand binds in a peculiar pose that forces TM6 to move away from the 
transmembrane bundle at the extracellular side. This modifies the rest of the helix, 
including the cytoplasmic side. This peculiar binding pose renders most of the 
extracellular domain invisible, including the conserved cysteine bridge between 
TM3 and extracellular loop 2. Thus, due to the specific fusion splice points (in 
4NTJ, 4PXZ and 4PY0) and the peculiar binding mode of the ligand (in 4NTJ),
the cytoplasmic side of TM6 in the P2Y12 receptor crystal structures appears 
distorted and, therefore, these structures were excluded. Similarly, TM7 of FFAR1 
(4PHU) is shorter than in other class A GPCRs and appears to be truncated so 
this receptor was also excluded. 

Calculation of residue contacts. We defined a residue contact between a pair of
residues as present when the distance between any two atoms from the residue
pair is less than the sum of their van der Waals radii plus a cut-off distance of
0.5 Å. Contacts involving the backbone atoms between residues that are less than
four amino acids apart in the protein sequence were ignored to exclude local
contacts. We only considered contacts mediated by the residues of TM1-7
or the amphipathic helix 8 (H8) (based on the GPCRdb numbering scheme), as
ligand binding primarily leads to reorganization in the transmembrane bundle,
and as the loops and termini are highly variable in length, structure and intrinsic
disorder among the different GPCRs. While rearrangements of water-mediated
contacts have been shown to be important for receptor activation, they are
not considered here because not all structures have water molecules resolved in
their coordinates. In each structure, residues present in TM1-7 or in H8 are
treated as nodes, and the presence of atomic contacts between a pair of residues
is treated as an edge connecting the nodes. We then evaluated the presence of
residue contacts between topologically equivalent residues across GPCR structures.
Structural equivalence was assigned using the GPCRdb numbering
scheme from GPCRdb.
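
The contact definition above can be sketched in a few lines of Python. This is an illustration only, not the authors' Perl code. The Bondi radii, the tuple layout of an atom record and the function name residue_contacts are assumptions made for the example. The sequence-locality rule is read here as: skip any atom pair that involves a backbone atom when the two residues are fewer than four positions apart.

```python
import math
from itertools import combinations

# Bondi van der Waals radii in angstroms (assumed; the text does not name a radius set).
VDW_RADII = {"C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}
# Backbone atom names as used in PDB files.
BACKBONE = {"N", "CA", "C", "O"}


def residue_contacts(atoms, cutoff=0.5, min_seq_sep=4):
    """Return the set of residue-index pairs (i, j), i < j, that are in contact.

    atoms: iterable of (residue_index, atom_name, element, (x, y, z)) tuples
           (a hypothetical record layout chosen for this sketch).
    cutoff: distance in angstroms added to the sum of the two vdW radii.
    """
    contacts = set()
    for atom_a, atom_b in combinations(atoms, 2):
        res_a, name_a, elem_a, xyz_a = atom_a
        res_b, name_b, elem_b, xyz_b = atom_b
        if res_a == res_b:
            continue  # atoms of the same residue never define a contact
        # Ignore backbone-mediated contacts between sequence-local residues.
        is_local = abs(res_a - res_b) < min_seq_sep
        if is_local and (name_a in BACKBONE or name_b in BACKBONE):
            continue
        threshold = VDW_RADII[elem_a] + VDW_RADII[elem_b] + cutoff
        if math.dist(xyz_a, xyz_b) < threshold:
            contacts.add((min(res_a, res_b), max(res_a, res_b)))
    return contacts
```

The returned pairs are the edges of the residue contact network, with residues as nodes. The all-against-all loop is quadratic in the number of atoms, which is acceptable for a single receptor but would normally be replaced by a spatial index for larger systems.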

Contact fingerprinting. The functional importance of a given residue contact
across a group of proteins can be estimated based on the extent to which
topologically equivalent contacts are maintained consistently. For every
residue contact within the ensemble, the presence or absence of an equivalent
residue contact between topologically equivalent positions across the
rest of the GPCR structures was recorded. This information is stored as a
bit string of ones (present) and zeros (absent), referred to here as a
‘contact fingerprint’. Identifying contact fingerprints that represent consistently
maintained residue contacts across and between conformational states
enabled us to identify the conserved rearrangements of residue contacts in
class A GPCRs. For the identification of residue contacts that are specific
either to the inactive or the active states, we focused only on the GPCRs that
had structures determined in both the inactive and the active state. For the
analysis of contacts across GPCRs including the inactive- and active-state-only
structures (Fig. 4b), the contact distance threshold was computed based
on the sum of the van der Waals radii of atoms plus multiple cut-off distances
between 0.5 Å and 0.8 Å, in order to account for variation in the quality of
structures.
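
A minimal Python sketch of the fingerprinting idea follows. It assumes each structure's contacts are already expressed as pairs of GPCRdb generic positions. The function names, the dictionary layout and the structure labels are hypothetical and chosen only for illustration.

```python
def contact_fingerprints(contacts_by_structure, order):
    """Map every observed contact to a bit string over the structures in `order`.

    contacts_by_structure: dict of structure label -> set of (pos_a, pos_b) pairs,
                           with positions given as GPCRdb generic numbers.
    order: list of structure labels fixing the bit order of the fingerprint.
    """
    all_contacts = set().union(*contacts_by_structure.values())
    return {
        contact: "".join(
            "1" if contact in contacts_by_structure[label] else "0"
            for label in order
        )
        for contact in all_contacts
    }


def state_specific(fingerprints, order, state_of, state):
    """Contacts present in every structure of `state` and in no other structure."""
    target = "".join("1" if state_of[label] == state else "0" for label in order)
    return {contact for contact, bits in fingerprints.items() if bits == target}
```

With two hypothetical receptors ordered as inactive, active, inactive, active, a contact with the fingerprint 1010 is specific to the inactive state and one with 0101 is specific to the active state. Repeating the analysis with contact sets computed at several cut-offs between 0.5 Å and 0.8 Å gives one way to check that a fingerprint does not depend on structure quality.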

Sequence analysis. The alignment of 311 non-olfactory, class A human GPCRs
was obtained from the GPCRdb. A total of 285 positions, spanning all seven
transmembrane segments and H8, were aligned. The percentage presence of
amino acids for residues at positions 3x46, 6x37 and 7x53 was calculated from
this alignment. The set of large hydrophobic or aromatic residues was defined to
include the following amino acids: Val, Leu, Ile, Met, Phe, Tyr, and Trp. The odds
of finding large hydrophobic or aromatic residues at both positions that form a
contact across class A GPCRs were calculated as P/(1 − P), where P is the probability
of large hydrophobic or aromatic residues in both positions that mediate a
contact.
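
The odds calculation can be written out as a short Python sketch. It assumes the two alignment columns are given as equal-length strings of one-letter amino-acid codes, one character per receptor, and it estimates P as the fraction of receptors with a qualifying residue at both positions. The function name contact_odds is hypothetical.

```python
import math

# Val, Leu, Ile, Met, Phe, Tyr and Trp, as listed in the text.
LARGE_HYDROPHOBIC_OR_AROMATIC = set("VLIMFYW")


def contact_odds(column_a, column_b):
    """Odds P / (1 - P) that both positions hold a large hydrophobic or aromatic residue.

    column_a, column_b: one-letter residues at two aligned positions,
                        in the same receptor order.
    """
    if len(column_a) != len(column_b) or len(column_a) == 0:
        raise ValueError("columns must be non-empty and of equal length")
    both = sum(
        a in LARGE_HYDROPHOBIC_OR_AROMATIC and b in LARGE_HYDROPHOBIC_OR_AROMATIC
        for a, b in zip(column_a, column_b)
    )
    p = both / len(column_a)
    # The odds are unbounded when every receptor qualifies.
    return math.inf if p == 1 else p / (1 - p)
```

Odds above 1 mean that a qualifying pair is more likely than not. For example, if half of the receptors qualify, then P = 0.5 and the odds are 1.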

Structure analysis and visualization. Computer scripts used for identifying 
and comparing residue contacts in protein structures were written in Perl. For 
code availability, please contact the authors. Visualization of protein structures 
and residue interaction networks was performed using PyMOL (The PyMOL
Molecular Graphics System, Version 1.8 Schrödinger, LLC.).

Signalling profile of Vasopressin V2 receptor mutants using BRET-based 
biosensors. Vasopressin V2 receptor construct used in the experiments. The 
human vasopressin V2 receptor construct included the N-terminal SNAP-tag 
for easy detection and quantification and Twin-Strep tag for optional purification
in pcDNA4/TO vector (SNAP-TS-V2R). The mutations were introduced
using a two-fragment PCR approach followed by Gibson assembly and confirmed
by sequencing.

Vasopressin V2 receptor ligands. [Arg8]-Vasopressin (Cys-Tyr-Phe-Gln-Asn-
Cys-Pro-Arg-Gly-NH2; disulfide bridge: Cys1-Cys6) was purchased from
Genemed Synthesis Inc. The ligand dilutions were prepared in PBS with 0.1%
(w/v) BSA. Stock solutions were stored at −20 °C and dilutions were freshly
prepared for each experiment.

Biosensor constructs. For the plasmids encoding the RlucII-Gα constructs,
Renilla luciferase (RlucII) was inserted into the coding sequence of human
Gαs and Gαq constructs at positions 117 and 118, respectively. Gγ1 was
N-terminally tagged with GFP10 as described. Untagged Gβ1 was used for
all experiments.

Cell culture and transfection. Human embryonic kidney (HEK) 293SL cells
were transiently co-transfected with SNAP-TS-V2R, different RlucII-Gα variants,
Gβ1 and GFP10-Gγ1 for G protein activation measurements. HEK 293SL
cells were obtained from the laboratory of Prof. Stephane Laporte, McGill
University. They are clonal isolates from HEK293 cells originally obtained
from ATCC. Morphology and gene expression profiles validate their identity.
Mycoplasma testing is done on a routine basis for all cells and only cells free
of Mycoplasma are used for experiments. Linear 25 kDa polyethylenimine
(PEI) (Polysciences Inc.) was prepared in phosphate-buffered saline (PBS)
(Multicell) (PEI:DNA ratio 3:1). The cells were seeded into white Cellstar PS
96-well cell culture plates (Greiner Bio-One) at a density of 20,000 cells per
well and grown for 48 h at 37 °C with 5% CO2.

Biosensor measurements. At 48 h after transfection the 96-well plates were
washed once with 200 µl PBS per well and 90 µl of Tyrode’s buffer (NaCl 137 mM,
KCl 0.9 mM, MgCl2 1 mM, NaHCO3 11.9 mM, NaH2PO4 3.6 mM, HEPES
25 mM, glucose 5.5 mM, CaCl2 1 mM, pH 7.4) were added for measurements.
The cells were stored at 37 °C with 5% CO2 for 2 h before the measurement.
The plates were incubated with 10 µl ligand or vehicle per well for 5 min, then
the luciferase substrate, coelenterazine 400a (DeepBlueC), was added to a final
concentration of 2.5 µM. After a further 5 min of incubation, luminescence
and GFP10 fluorescence were measured at 410 ± 40 nm and 515 ± 15 nm
respectively, in a Synergy Neo plate reader (Biotek) using 0.4 s integration time.
BRET was expressed as a ratio of the emission at 515 nm over the emission at
410 nm.
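
The BRET readout described above reduces to a simple ratio, sketched here in Python. The function names and the (515 nm, 410 nm) tuple layout are hypothetical. The subtraction of a vehicle well is an assumption added for illustration, since the text only defines the ratio itself.

```python
def bret_ratio(emission_515, emission_410):
    """BRET ratio: acceptor (GFP10, 515 nm) emission over donor (RlucII, 410 nm) emission."""
    if emission_410 <= 0:
        raise ValueError("donor emission must be positive")
    return emission_515 / emission_410


def delta_bret(ligand_well, vehicle_well):
    """Ligand-induced change in BRET relative to a vehicle well (assumed normalization).

    Each argument is an (emission_515, emission_410) tuple of raw plate-reader counts.
    """
    return bret_ratio(*ligand_well) - bret_ratio(*vehicle_well)
```

A negative value from delta_bret corresponds to a ligand-induced drop in the ratio, which is the direction the biosensor reports for G protein activation.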
Quantification of expression levels. SNAP-TS-V2R wild-type- and mutant-
expressing HEK293SL cells were labelled with 50 µl 1 µM SNAP-Surface Alexa
Fluor 647 dye (New England Biolabs Inc.) in complete medium (DMEM, 10%
FBS, Penicillin-Streptomycin) for 30 min at 37 °C, 5% CO2 in 96-well plates.
The wells were each washed with 100 µl of complete medium three times,
and once with PBS. For measurements, 90 µl Tyrode’s buffer were added to
each well. Fluorescence was measured with 630 nm excitation and 670 nm
emission.
Extended Data Figure 1 | Sequence analysis of positions involved in the
conserved rearrangement during receptor activation. The alignment of
311 non-olfactory class A human GPCRs was obtained from GPCRdb.
Percentage of residue pairs making a contact in the inactive state only
(a, orange spectrum; five bins) and the active state only (b, green
spectrum; five bins). The set of large hydrophobic/aromatic residues was
defined to include the following amino acids: Val, Leu, Ile, Met, Phe, Tyr,
Trp. c, The percentage of amino acids occupying residue positions 3x46,
6x37 and 7x53 involved in the conserved rearrangement is shown. d, The
odds of finding a large hydrophobic or aromatic residue at a pair/triplet
of positions that form a contact during the conserved rearrangement. See
Methods for details.
Extended Data Figure 2 | Signalling profile of Vasopressin V2 receptor
mutants using BRET-based biosensors. a, b, Activation of Gs (a) and Gq
(b) by wild-type and mutant vasopressin V2 receptors. The residues at
positions 3x46, 6x37 and 7x53 were mutated to alanine separately and in
combination and their G protein signalling response was measured using
BRET-based biosensors. The data points represent the mean ± SEM of
n = 3-4 experiments. Gs is the primary and Gq the secondary cognate G
protein of the vasopressin V2 receptor. For both G proteins, the activation
is monitored by the decrease in BRET resulting from the separation
between Gα and Gβγ. The G protein response was markedly reduced for
all mutants, pointing towards reduced receptor activity. The expression
levels of the mutant receptors were quantified using a fluorescent dye
bound to the SNAP tag of surface-expressed receptors. While the Y325A
(Y7x53A), T273A (T6x37A) and T273A/Y325A (T6x37A/Y7x53A)
proteins were expressed at wild-type levels, M133A
(M3x46A) and its combinations showed reduced expression levels.
As a comparison, the signalling of reduced levels of wild-type vasopressin
V2 receptor (% of DNA transfected) for Gs and Gq is shown in c and d,
respectively. This indicates that the reduction in expression level alone
cannot explain the reduced signalling of Y325A (Y7x53A), T273A
(T6x37A) and the Y325A (Y7x53A)/T273A (T6x37A) combination. Thus,
the reduced signalling activity of Y325A (Y7x53A), T273A (T6x37A) and
T273A/Y325A (T6x37A/Y7x53A) is due to a change in the ability of the
receptor to promote G protein activation. In the case of receptor mutants
(single, double and triple) involving M133A (M3x46A), both the G protein
response and the expression level of the receptor are markedly reduced.
This may be because position 3x46 is involved in mediating contacts that
are important for both the active and the inactive state, and mutating this
position might simultaneously affect receptor biogenesis and downstream
response. e, The increases in EC50 (along with the decrease in maximal
responses) are consistent with a reduced ability of the receptor to activate
G proteins.
CORRECTIONS & AMENDMENTS


CORRIGENDUM 
doi:10.1038/nature18311 


Corrigendum: Cerebral cavernous 
malformations arise from 
endothelial gain of MEKK3- 
KLF2/4 signalling 


Zinan Zhou, Alan T. Tang, Weng-Yew Wong, Sharika Bamezai,
Lauren M. Goddard, Robert Shenkar, Su Zhou, Jisheng Yang, 
Alexander C. Wright, Matthew Foley, J. Simon C. Arthur, 
Kevin J. Whitehead, Issam A. Awad, Dean Y. Li, 

Xiangjian Zheng & Mark L. Kahn 


Nature 532, 122-126 (2016); doi:10.1038/nature17178 


In this Letter we omitted to cite a relevant paper! showing that loss of 
cerebral cavernous malformation (CCM) signalling confers an increase 
in KLF2 expression in endothelial cells and in the developing zebrafish 
heart. We regret this oversight. 


1. Renz, M. et al. Regulation of β1 integrin-Klf2-mediated angiogenesis by CCM
proteins. Dev. Cell 32, 181-190 (2015). 


ERRATUM 
doi:10.1038/nature18932 


Erratum: The bacterial DnaA- 
trio replication origin element 
specifies single-stranded DNA 
initiator binding 

Tomas T. Richardson, Omar Harran & Heath Murray 


Nature 534, 412-416 (2016); doi:10.1038/nature17962


In Fig. 2c of this Letter, the rectangle at the bottom right corner was 
erroneously labelled ‘ADP’ rather than ‘ATP’; this has now been 
corrected online. 


RETRACTION 
doi:10.1038/nature18613 


Retraction: Odour receptors and
neurons for DEET and new insect
repellents


Pinky Kain, Sean Michael Boyle, Sana Khalid Tharadra, 
Tom Guda, Christine Pham, Anupama Dahanukar & 
Anandasankar Ray 


Nature 502, 507-512 (2013); doi:10.1038/nature12594 


We are retracting this Article because we no longer have confidence 
in data that support one of our key conclusions. In this Article we 
reported four advances in insect repellency: identification of olfactory 
neurons in Drosophila melanogaster that participate in repellency to 
N,N-diethyl-meta-toluamide (DEET); identification of an ionotropic 
receptor, Ir40a, expressed in these neurons required for avoidance to 
DEET; development of a chemical informatics method of identifying 
shared structural features from known behavioural repellents; and 
validation of a series of computationally identified natural chemicals
as repellents for flies and mosquitoes. We no longer have confidence 
in data supporting that Ir40a is a DEET receptor. Upon reanalysis, the 
original calcium imaging (GCaMP) data show movement artefacts
and background effects that we originally missed, which seriously 
undermine our confidence in Ir40a responses to DEET. In addition, 
Supplementary Fig. 5b presents several inappropriately re-used panels. 

Upon learning that A. F. Silbering et al.' did not find defects in 
DEET aversion in Ir40a mutant flies, we repeated many of the original 
behaviour experiments. Although we confirmed significant behav- 
ioural differences in Ir40a cell-silenced flies (Ir40a-Gal4; UAS-TNTG), 
as reported in Fig. 2d, we have been unable to replicate observations of 
behavioural experiments using Ir40a-Gal4; UAS-RNAi flies. Therefore, 
with the exception of author Pinky Kain, we no longer have confidence 
in the conclusions of Figs 2, 3 and 5c, and Supplementary Fig. 5. We 
remain confident of the chemical informatics analyses and the iden- 
tification of new repellents, which have been successfully repeated in 
our laboratory and by others, as reported in Figs 4, 5d and e, 6, and 
Supplementary Figs 2 and 6-9. Although it may still be possible that 
Ir40a does respond to DEET, given the issues listed above, all authors 
except Pinky Kain wish to retract this Article in its entirety. We deeply 
regret these circumstances and apologize to the scientific community. 


1. Silbering, A. F. et al. Ir40a neurons are not DEET detectors. Nature 534, E5-E7,
http://dx.doi.org/10.1038/nature18321 (2016). 
KATE PATTERSON/WASHINGTON POST


CAREERS 



Adam Kavalier developed the logo and concept for his company, Undone Chocolate, during his postdoc at Weill Cornell Medical College in New York City. 


MOONLIGHTING 


Dip your toes 


Wayward eyes? You can explore a calling outside academia while still in the lab.


BY AMY MAXMEN 


Like a marital affair, testing the waters
of another career while working at the
bench can feel wayward and illicit. Only
later do scientists who have two-timed realize
how important their digression was in shaping
their career.

For some, fantasies of other careers emerge 
as passion for their current research fades. “I 
knew the evolution of this group of snails I was 
studying was important to some people, but I 
started to ask myself if it really mattered,’ says 
Stephanie Aktipis, recalling her days as a grad- 
uate student at Harvard University in Cam- 
bridge, Massachusetts. For others, academia 
starts to feel impractical. Forty years ago, 55% 


of those who gained a biology doctorate in the 
United States secured a tenure-track position 
within 6 years; by 2006, that number had dwin- 
dled to 15%, and it still seems to be dropping 
(see Nature 472, 276-279; 2011). 

Given that fall, why would a graduate student 
or postdoc not explore other options? “The 
culture in academia is to keep your head down 
and get your work done, but that’s crazy,” says
geneticist Ethan Perlstein, who founded and 
runs a drug-discovery start-up in San Fran- 
cisco, California. “You should constantly be in 
the process of discovery. You should think of 
your career like a project and experiment.” 

For those who do branch out, the pay-off can
be life-altering. Aktipis interrupted her doctoral
work to earn a master’s degree in science
policy, and it eventually led to her position as
foreign-affairs officer at the US Department
of State. Perlstein built the foundation for his
own privately funded lab while doing a postdoc
at Princeton University in New Jersey. He
had become disenchanted with academia after
receiving more than 20 rejection letters for faculty
jobs.

Researchers who want to explore different
options need not wait until they have finished
their PhD or completed a postdoc or contract
position (see ‘A delicate path’). Launching a
start-up, completing an internship or pursuing
another degree are some of the ways in
which scientists have tested the waters of a new
career. To explore the entrepreneurial world,
junior researchers can attend specialized
gatherings, including those listed on
meetup.com, in cities with thriving technology
sectors such as Boston, London and San
Francisco. Social-media platforms, particularly
Twitter, can show what’s happening in
various start-up spheres, and provide a way to
contact people with similar aspirations who
could weigh in on your proposed project.

Many universities have incubator spaces 
where scientists can nurture start-ups. Such 
spaces usually offer free advice from alumni 
who have become chief executives, venture 
capitalists and lawyers. A neuroscience post- 
doc in the northeastern United States (who 
wished to remain anonymous for fear of reper- 
cussions at the university) recently used such 
an incubator to develop a food-related ven- 
ture with three partners. Initially, they spent 
10-20 hours each week on the start-up. Since 
incorporating the company last year, they now 
devote 30 hours a week to it. 

The postdoc manages to juggle research 
and company obligations because all the 
experimental work is done and some flexibil- 
ity is allowed in writing up publications. Once 
the postdoctoral stint is over, the researcher 
aims to focus exclusively on running the ven- 
ture — a task that requires both business skills 
and scientific expertise. “I’m not just selling,” 
they say. “Fifty per cent of what I do is technical 
and analytical.” 

Internships, too, provide windows into 
different careers, and the time commitment 
is more tightly controlled than it is for entre- 
preneurial ventures. While studying environ- 
mental microbiology as a graduate student at 
Harvard, Meredith Fisher began to think that 
she might like a science-related career in law, 
policy or business. She interned first with the 
Union of Concerned Scientists, an advocacy 
group in Cambridge, then with a local environ- 
mental lawyer and finally with Mass Energy, a 
non-profit group that promotes green energy. 

Fisher created the internships by volun- 
teering for unpaid work. Each one lasted for 
2-3 months and took 8-10 hours a week. 
Encouraged by her experiences, she decided 
after earning her PhD to pursue a Master’s of 
Business Administration, and this helped her 
to land her current position as a partner at a 
venture-capital fund in Boston, Massachusetts. 
“Even if you can’t find internship opportunities
near you,” she advises, “so much can be done
virtually that you can ask an organization what 
you might do for them remotely.” 


TAKING THE PLUNGE 

For some, however, toe-dipping is not as effec- 
tive as total immersion. That was the route 
taken by Aktipis. Rather than waiting to com- 
plete her PhD on snails, she took a year’s leave 
of absence from Harvard and travelled to the 
London School of Economics and Political Sci- 
ence to do a master’s degree in science policy. 
The programme delayed her PhD, but she 
learned that she loves life in the policy world. 


EASING THE WAY 


A delicate path 


Moonlighting requires a time commitment
that is likely to be noticed. Those who
have taken the plunge say that anyone
considering doing likewise should consider
discussing it with their principal investigator
(PI). At the very least, the PI can try to
understand changes in your schedule.

For Rachel Haurwitz, co-founder and 
chief executive of Caribou, a start-up that is 
developing the genome-editing tool CRISPR, 
that discussion turned into a dream come 
true. When Haurwitz began her PhD at the 
University of California, Berkeley, she kept 
quiet about her interests in industry. But 
her ears pricked up whenever her adviser, 
Jennifer Doudna, mused about potential 
commercial applications of technologies 
developed in the lab. When those musings 
turned to CRISPR, the subject of Haurwitz’s 
doctoral research, Haurwitz confessed to 
her interest in biotechnology. Soon the two 
were brainstorming names for the CRISPR 
company that they would co-found. 

Writing CRISPR-related patent 
applications taught Haurwitz that she did 
not enjoy patent law — one career that she 
had quietly considered in her first year of 
graduate school. But she loved building the 
company. During her PhD programme, she 
took a class taught by venture capitalists, 
which paid off when she later became chief 
executive of Caribou. 

Students considering jobs outside
academia can learn which lab heads
will be open to the idea by listening for
encouraging comments about former
students and colleagues now working in
industry, or about discoveries being made
outside the ivory tower. When Meredith
Fisher was being interviewed for graduate
school, she homed in on mentors who
were likely to be supportive by telling every
professor whom she spoke to that she did
not intend to continue in academia after
her PhD. Many senior faculty members
indicated their disapproval, and Fisher,
now a partner at a venture-capital fund in
Boston, Massachusetts, ruled them out.

Some PIs expect a full commitment to
lab research, and see themselves purely
as mentors in academic science. Certain
graduate students and postdocs say that
they regret having been frank with their
bosses because it cost them grant money
or teaching opportunities, and it sometimes
caused a personal rift. Discretion may be
the best choice. A neuroscience postdoc
who is nurturing a food-related start-up
never works on the company in the lab,
and weekly catch-ups allay any concerns
that the PI might have about the progress
of the research. In addition, many science
departments have policies on outside
employment, particularly when the
university is supporting a researcher with
funds. Before earning money on the side,
or becoming a stakeholder in a company,
students should learn about these policies
from their institution’s human-resources
staff or office of graduate-student affairs.

Those who decide to have ‘the talk’ with
an adviser should be positive about their
research and open about their long-term
goals, Fisher says. “Rather than say, ‘I
don’t want to do this’, you might say that
your talents may lie somewhere else,” she
advises. Beforehand, consider how an
adviser could help you, and try to recognize
their needs so that you can pre-empt any
concerns. And be prepared for objections.
“I was asked why I wanted to waste my
time with a PhD when I didn’t intend to be
a professor,” Fisher recalls. “I would say my
PhD offers the training I want in science, and
the recognition it provides will be a boon to
my future career.” She adds: “Be proud of
what you’re doing.” A.M.


Rachel Haurwitz, co-founder of Caribou.
DON FERIA/AP IMAGES FOR CARIBOU BIOSCIENCES


Prasanna Bhogale took a similar approach
by diving into a full-time internship. Towards
the end of his PhD in physics at the University
of Cologne, Germany, he was feeling unsure
about a postdoc. Rather than wait to finish his
PhD, he applied for internships that he found
on LinkedIn and other sites.

When he got an offer for a paid six-month
stint with a German investment company, he
put his research on hold and accepted the post.
Although he learned that finance is not his
calling (he’ll start work at a small analytics
company in Cologne this year after completing
his degree), he says that the internship
transformed his thinking. “For the first time, I
learned I had marketable skills,” he says. “This
was psychologically hugely important, since
academia can be a soul-crushing experience.
It was extremely liberating to find out I was
appreciated in other contexts.”


Aleksis Karme is juggling his PhD in palaeontology with running his own virtual-reality company.
ALEKSIS KARME/TEATIME RESEARCH


SMART WORKING 

Adding ten or more hours a week into an 
already-busy schedule is not easy, so efficiency 
is vital. Researchers who have successfully 
managed the extra workload suggest making 
a schedule for each day. Anything that does not
contribute to the goal of the research should 
be cut. Social media, e-mails, phone calls 
and other potential time-wasters should be 
restricted to certain hours. 

Even so, scientists who are committed 
to a side pursuit say that they sacrifice their 
social life, and sometimes their rest. “I didn’t
do much else besides my research and making
chocolate,” says Adam Kavalier, a chemist
who developed the logo and concept for his
company, Undone Chocolate, during his
postdoc at Weill Cornell Medical College
in New York City. “I didn’t sleep much:
I worked weekends and nights. I did not
take vacations,” he says. “But I already had
a passion for making chocolate, and once I
decided to make it a business, that became
an obsession.”

Passion also drew Aleksis Karme, a PhD 
student in palaeontology at the University of 
Helsinki, to his test project. He co-founded a 
virtual-reality company, and is now juggling 
that with his dissertation. 

He admits to overworking, but the upside
is that he can apply techniques that he creates
for his lab work to real-world projects in
construction and engineering. “I can control
my own career better this way,” he says. “I’m
finding a way around the conveyor belt from
a PhD to a postdoc.”


NO REGRETS 

Many researchers who eventually left academia
for other jobs they’d sampled while at
the bench say that they published less than
they might have done had they focused on
science alone. But that never worried them,
because they knew that it would not matter
in the long run. Meanwhile, those who stayed
in academia say that they do not regret
temporarily veering from the path.

As a postdoc, microbiologist Robin
Kodner served on the board of a biofuel
start-up and consulted for the biofuel
industry. The experience assured her that
academia was the right choice for her, and
she is now a faculty member at Western
Washington University in Bellingham.
“Venture capitalists would ask me if I could
have a product in two years, and I was like,
are you joking? Being from academia, I
was totally comfortable saying, ‘Well, we
don’t really know that yet, and here are
the caveats on what we do know’,” she says.
“But that’s not how you talk with investors.
I wasn’t a good fit. I’m still so glad I did it
because I got a taste of what that life was
like, and at the same time I didn’t really lose
traction on my academic career.”

Junior researchers need to remember 
that looking elsewhere is not cheating. “My 
advice is that you have control over your life,” 
Aktipis says. “Know you can do this. Know
it is an option.”


Amy Maxmen is a freelance writer in
Berkeley, California.

TRADE TALK
Binational liaison


After completing a PhD and postdoc in
neuroscience, Sabine Blankenship took
a job as the science liaison officer at the
German Consulate General in San Francisco.
She describes how she helps her compatriots
to stay up to date with scientific developments
across northern California and the Pacific
Northwest.


What do you do? 

I build professional networks, engaging 
researchers and learning about scientific 
developments. Recently, I participated in an 
industry day at Lawrence Berkeley National 
Laboratory, California. Before that, I spoke 
to relevant researchers to learn about the 
CRISPR-Cas gene-editing technology and 
wrote a report for German government 
officials. When German delegations visit, I 
organize their agendas. I can spend a lot of 
time at the computer requesting meetings, 
but our work depends on getting out and 
getting to know people. The communicative
part is something I totally enjoy, along with
the ability to keep learning.


How do you apply your training? 

Very rarely is it factual knowledge. Anything
in life sciences, I can follow in depth, but I’m
also responsible for energy and information
technology. What I really learned in graduate 
school was how to research something deeply 
and structure that information into a useful 
format. Also, self-management is important. 
I have to take the initiative and see that my 
projects keep running. 


How did you get the job? 

What tipped me to this position was a con- 
ference run by the German Academic Inter- 
national Network, which helps German 
nationals to find jobs back home. I thought 
it was the wrong conference to go to since I 
wanted to stay in the United States, but then 
I found the job ad for the consulate. Friends 
introduced me to two people who work at 
consulates and I had coffee and lunch with 
them. Talking to them made me realize that 
consulate work is something I could be good at 
and love. And it proved to be useful for
preparing for the job interview.


INTERVIEW BY MONYA BAKER
This interview has been edited for length and
clarity. See go.nature.com/2auauki for more.

SCIENCE FICTION


INTERDIMENSIONAL TRADE BENEFITS 


A powerful argument for cooperation. 


BY BRIAN TRENT 


“It was a missile,” she explained.

The admirals of the Brightworld
Imperium blinked from where they
sat around the crisis room. “What do you
mean?” one of them said at last. “A missile?
From where? From whom?”

Dr Harshadi Hennig stood at the centre
of the rounded chamber, feeling less like a
scientist giving a presentation and more like
a condemned prisoner. She raised her
hologloves, conjuring an image of Eris Station
orbiting Neptune’s serene planetscape.

“This is a recording from an approaching
shuttle,” Hennig said.

The Neptunian station was a grey wheel
against the void. The time index ticked away
until, at the 18:43:30 mark, a greenish
light split the blackness. It seemed as though
a massive Venetian blind was opening, an
emerald sky materializing against cold space.
A colossal, spire-like shape appeared. Then
the light blinked out, the odd shape
disappearing with it. The shuttlecam trembled in
aftershock. The orbital station was ...

“Gone,” one of the admirals snapped in the
impatient tone used for all plebeians of the
Empire. “Yes, Dr Hennig, we know this. Tell
us what the devil happened to Eris Station!”

She bowed obediently. “Of course, Admiral.
The station was destroyed by a missile
fired from another dimension.”

“It ... what?”

“A missile. Not fired by insurrectionists
or terrorists, but from a universe next door.”

She conjured a still-frame of the greenish
rupture with her gloves.

“It appears,” she continued, “that the
multiverse theory is correct. This video shows the
splitting of a dimensional membrane, and the
brief materialization of what we believe is the
alien equivalent of an ICBM.”

The admiralty board conferred with each
other. “Are you saying that multidimensional
aliens attacked us?” one cried.

“Well ... not exactly.” She clapped her
hands and the hologram vanished. “If not for
this lucky recording, we’d never have known.
You see, this wasn’t an attack. Frankly, our
extraterrestrial neighbours didn’t know life
occupied this side of the membrane at all.”

“Dr Hennig, you’re not making any
sense ...”

“What we saw,” the scientist dared
interrupt, “was an interdimensional weapon test.”

Again, a deadly silence filled the chamber.
“Explain,” someone managed.

“I’m not sure how much I can explain. Like
us, these aliens are involved in an arms race.
Like us, they’re always experimenting with
bigger and better technology. Build a bomb,
your enemies build a shield. So you build a
better bomb. What we just witnessed ...”

“Yes?” 

“... was the test of an extradimensional 
missile. Put simply, it’s a missile that winds up 
into a higher dimen- 
sion — our dimen- 
sion — and is then 
fired back into their 
dimension, in order to 
achieve extraordinary 
hypervelocities. Imag- 
ine having a fist-fight 
a few feet underwater. 
You're limited in how 
much power you can 
pack into a punch. 
But if you draw your 
fist through the air 
above...” 

She flexed her gloved fingers again, 
replaying the holo to a molasses-like crawl. 
The freakish alien missile appeared, and was 
gone in a blink. Eris Station was pulled in 
after it, like debris sucked into an undertow. 

“That’s what happened to Eris Station,” 
Hennig said decisively. “It got pulled into 
a neighbouring dimension along with the 
missile.” 

She noticed a new expression washing 
through the admiralty board. Excitement. 

“Tell us more...” one said, practically lick- 
ing his lips. 

“Certainly, sir. The test resulted in the 
deaths of millions of our alien neighbours.” 

“No doubt. Looks like a badass missile.”

“No, sir, you misunderstand. The missile
detonation was fierce, yes, but it was a
localized event. The alien test inadvertently caused
a virulent epidemic that spread through half a
continent.” She regarded her audience’s
perplexed faces. “Because when the people of Eris
Station were sucked into the interdimensional
wake, their organic matter caused a deadly
epidemic in the universe next door. Put
simply, the aliens wanted a fast missile and ended
up creating a bioweapon.”

“Will this ... happen again?” 

“It’s been happening, all over the Solar
System. Brief flashes of light. Navigator reports
of odd, transitory shapes glimpsed in space.
But the aliens haven’t been able to achieve the
original results, because they aren’t
accidentally sweeping up human beings with their
subsequent tests. Unfortunately, in their
determination to recreate the effects, they’ve
been increasing their tests. It’s only a matter
of time before a dimensionally spun-up
missile materializes in the same space as a planet.”

The admirals looked stricken. 

“Don’t worry,” Hennig said.

“Don’t worry?” an admiral mocked.
“This stupid pleb has discovered a threat 
to the Imperium and 
she's telling us not to 
worry!” 

“Perhaps she’s found 
a way to neutralize this 
threat?” another sug- 
gested. 

“Neutralize?” Hen- 
nig scratched her 
head. “Not exactly. 
You see, over the past 
few weeks I managed 
to establish contact 
with our interdimen- 
sional neighbours. 
Oh, it took a while, but we worked out a
rudimentary form of communication.
Surprised the hell out of them ... but it’s been
productive, after a fashion.”

“Wait, you initiated communication with 
a foreign power —” 

“We made a deal,” she spoke over the 
admiral. “They agreed to stop blindly testing 
their missiles in our dimension. In return...” 
She smiled. “We give them coordinates for 
where they should be testing their missiles.” 

She saw the realization kindle in their 
eyes. Or maybe, she thought, that was just 
the kindling of greenish light suddenly 
appearing around them. 

The admiralty board leapt to their feet. 

“We plebs are tired of being slaves to the
Imperium,” Hennig had to shout above the
escalating sound. “So we sent the aliens 
coordinates of your military bases. The 
aliens get their bioweapons, and we get to be 
free of you. Everyone wins!” 

One of the admirals screamed in the 
intensifying light. “You'll die too!” 

Hennig looked puzzled. “Me? Oh ... I’m
not even here.”

She clapped her gloves, and her hologram
vanished.


Brian Trent’s speculative fiction appears
regularly in ANALOG, Fantasy & Science
Fiction, COSMOS, Galaxy’s Edge and
numerous year’s best anthologies. His
website is www.briantrent.com.

Making a career decision can
be like trying to solve an
equation. Both tend to be
easier when they involve fewer variables.
Unfortunately, scientists who are thinking 
about where to work to achieve their goals 
must balance multiple priorities, often on 
the basis of poor-quality data.

One thing that is clear is that as 
competition for research positions and 
funding grows in the West, opportunities 
are opening up in the East. China doubled 
the proportion of its gross domestic 
product that it spends on research and 
development (R&D) from 1% in 2002 
to 2% in 2014 — and it plans to increase 
this to 2.5% by 2020. By then, the 
Organisation for Economic Co-operation 
and Development predicts that China will 
have overtaken the United States as the 
country that spends the most on R&D. 

South Korea's investment of 4.3% of 
its GDP on R&D in 2014 was the largest 
of any country in the world. In January, 
Singapore unveiled a five-year plan to 
boost government support for R&D 
by 18%. Japan, while also increasing 
funding, is stepping up its efforts to attract 
international students and researchers 
by making more tuition available in 
languages other than Japanese. 

Money isn’t everything, of course.
Australia continues to be a magnet for
international researchers by focusing on
established strengths such as nanoscience.
New Zealand also concentrates on
its specialities such as earth and
environmental sciences. Both countries
have the happiest people of the nations
profiled in this supplement. New Zealand
came eighth and Australia ninth in the
United Nations World Happiness Report
2016 Update.

India is not yet a major player in 
science. Still, some of its leading 
institutions increasingly see international 
recruitment as key to the country 
achieving its scientific ambitions. 

The 2016 Naturejobs Career Guide to
the Asia-Pacific contains facts and figures
on metrics such as research quality, the
proportion of the workforce in research
and development, salaries and cost of
living. It also presents tips from employers
and employees on how to get a job and
how to deal with cultural differences in
each of the seven featured countries.
The hope is that this information
can help simplify the decision-making
equations faced by those thinking
about moving to the region, and in turn
increase the chance of finding career
solutions that meet their expectations.


Nic Fleming

RELOCATING SCIENCE


Countries are spending more than ever on research and development, but the fields they fund
vary depending on national priorities. And it is not just the research reputation that matters
when choosing whether to move abroad: cost of living and quality of life are factors too.


HOW SCIENCE IS SPREAD 
There is no doubt of China’s dominance when it comes to publishing research in high-quality journals. Its weighted fractional 
count (WFC; a score used by the Nature Index, see description below) was 5% higher in May 2015 to April 2016 compared 
with 2014. With the exception of New Zealand, the WFC of the other Asia-Pacific countries profiled fell. 


[Chart: WFC by country for mainland China, South Korea, India, Singapore, Australia and New Zealand]

MAINLAND CHINA
Mainland China’s WFC exceeds that of the other profiled Asia-Pacific countries put together. The rate at
which its research output grew was also the highest in the region.

SOUTH KOREA
South Korea’s continued drive to increase investment in research and development (R&D) as a proportion of
gross domestic product (GDP) is not reflected in its WFC, which fell by 12% between May 2015 and April 2016
compared with 2014.

SINGAPORE
With a population of around 5.4 million, Singapore would be
the most productive country if WFC were calculated per capita.

SUBJECT STRENGTHS
China devoted more of its research efforts to physics and less to health and medical sciences in 2015 than any of the other countries
profiled, according to the number of academic journal articles in the Scopus database. Articles on boundaries between subjects can
be counted more than once, potentially distorting the relative subject proportions in these charts.

NATURE INDEX
The Nature Index database tracks the affiliations of high-quality scientific articles, and charts publication productivity for institutions and countries. Weighted fractional count (WFC)
accounts for the relative contribution of each author to an article and applies a weighting to correct imbalances in the index’s subject coverage. This Career Guide draws on Nature Index
data derived from articles published between 1 May 2015 and 30 April 2016. WFC is used throughout this supplement as the primary metric, because it provides an even basis for
comparison. For more information, visit natureindex.com/faq

FUNDING OVER TIME
R&D funding as a proportion of GDP has grown most rapidly in South Korea and
China since 2000, but has remained consistently low in India and New Zealand.

SPENDING PER RESEARCHER
Of the profiled countries, Singapore spends the most on R&D per researcher. Figures
are the most recent available and are normalized for purchasing power. The United
States and United Kingdom are included for comparison.

Singapore comes out on top for national research quality. Quality is based on
the percentage of each country’s biological and physical science articles in the
Scopus database that make it into the Nature Index.

South Korea has more of its workforce engaged in research than any of the other
profiled countries. As a proportion of the working population, South Korea has around
30 times more researchers than India, according to the latest data available.

COST OF LIVING
Major cities in Australia are the most expensive places to live among the
countries profiled, on the basis of an index of the relative ability of people on
average salaries to buy goods and services.

LIFE SATISFACTION
When people of 157 countries were asked to rate life satisfaction on a scale of zero to ten,
New Zealanders and Australians were the happiest people of the seven profiled countries.
Denmark (the happiest) and Burundi (the least happy) are included for comparison.

China’s Five-hundred-meter Aperture Spherical Telescope will help the search for extraterrestrial life.


CHINA 


As China continues to increase its investment in research, it is 
offering opportunities that can be difficult to find elsewhere. 


BY REBECCA KANTHOR 


In 2007, while visiting his then girlfriend,
US astronomer Eric Peng was working at a
rented desk at Peking University in Beijing
when he was approached by a fellow scientist.
“You keep coming back,” the man said. “Are
you looking for a job here?” Six months after
this casual conversation, Peng was offered a
faculty position in Peking University’s Kavli
Institute for Astronomy and Astrophysics. “I
think a lot of people thought it was an obvious
thing to do, but I never thought I would
actually move here,” he says.

China’s economy has been growing more
slowly in recent years: the gross domestic
product (GDP) grew by 6.9% in 2015, compared
with 10.6% in 2010. But foreign researchers say
that there has been no cooling of the
enthusiasm for enticing talented scientists to work in
the country. Overseas and expat scientists are
persuaded with promises of generous funding
and research opportunities. China’s most recent
five-year plan, for example, pledged to continue
with its drive to increase research and
development funding as a proportion of GDP, from
1% in 2002 and 2% in 2014 to 2.5% in 2020.

The government launched its 1000 Talents
plan in 2008 to attract scientists, entrepreneurs
and finance experts from across the world.
Precise figures for how many foreign scientists
work in China are hard to come by, but
according to anecdotal reports the incentives
are working at least in some places.

JIANG GE
Associate provost at
ShanghaiTech University


Why should foreign scientists come 

to China? 

The economy is strong, so the conditions
are good and money for research is
very stable. In fact, the government is
increasing investment in basic research,
not only for large-scale facilities, but
more generally. I see Chinese people
coming back from working in other
countries, as well as foreigners coming
to work here. Although some have had
good opportunities abroad, here they
have excellent equipment and there is
support for those starting up their
own research.


What do Shanghai and ShanghaiTech 
University offer foreign researchers? 
Shanghai is unique in being a huge city
with such a high density of facilities.
And there are more facilities planned
for the future. Researchers want a
good research environment, with good
academic partners and the funds to
purchase the materials they need. That
is what is on offer at ShanghaiTech
University. Shanghai is also home
to many scientific centres at which
researchers from all over the country
work on projects to serve the national
interest. These include the Shanghai
Synchrotron Radiation Facility, the
National Center for Protein Science and
the Free Electron Laser Facility. These
large-scale projects are funded by the
central government and the Shanghai
local government.


What skills do foreign researchers need 
to work in China? 

We don’t have enough top-quality
researchers to help China to reach the
level of research we are aiming for, so
that’s what we’re looking for. Shanghai
is an international city, so it is not
essential to speak Chinese: English is
widely understood among the research
community. R.K.


This interview has been edited for 
length and clarity. 


WHERE TO WORK 


The top ten institutions in mainland China, based on research output included in
the Nature Index, 1 May 2015 to 30 April 2016, shown as weighted fractional count
(WFC), a measure of the relative contribution of an author to an article weighted to
correct for imbalances between subjects. Bars are divided according to the
proportion that each subject area contributes to the overall score.

Chinese starting salaries are among the lowest in the Asia-Pacific countries
profiled, especially for those lower down the pay scale, according to data
collected in Nature’s interviews.

COLLABORATIONS

China’s average collaboration score (top) is the sum of Nature Index’s fractional
count (the relative contribution of authors to an article) for international
collaborations divided by the number of countries China collaborates with.

Overlaps in subject areas may cause some distortion to relative subject proportions.

RESEARCH FOCUS

The search for extraterrestrials gets a boost this year with
the completion of China’s new Five-hundred-meter Aperture
Spherical Telescope (FAST). As well as hunting for life on other
planets, the world’s largest radio telescope, which was finished in
July after 22 years of planning and 5 years of construction, will
also allow astronomers to study pulsars (the cores of exploding
stars) and survey hydrogen in faraway galaxies.

Built at a cost of 1.2 billion yuan (US$180 million) in a valley in 
the remote, mountainous region of the southern Guizhou province, 
FAST is made up of 4,450 triangular panels fitted into a giant dish 
with a surface area of 36 American football fields. Its 500-metre 
width smashes the previous record, held by the 305-metre wide 
Arecibo radio telescope in Puerto Rico. Some 9,000 people have 
reportedly been relocated to allow FAST to be built. 

The enormous size of this instrument, which has been 
nicknamed ‘Sky Eye’, makes it more powerful and sensitive than 
previous radio telescopes and allows it to survey a larger part of 
the sky. Because it should be capable of picking up weaker radio 
signals from further away than any previous telescope, FAST will 
be particularly useful in the hunt for extraterrestrial life. 

FAST forms part of China’s plan to build up its space 
programme. Having put its first astronaut in space in 2003 and 
having launched its first unmanned lunar expedition in 2013, 
the country is planning to have an operational space station by
around 2020 and to put a person on the Moon by 2030.


Staff at the construction site of FAST, the world’s most powerful radio telescope.

Peng says that half of the postdocs and 20%
of the faculty at his institute are from abroad.

The rapid growth of research funding 
in China makes it an attractive place for 
researchers looking to set up their own lab- 
oratories and projects. For Peng, it meant 
the opportunity to co-found the Telescope 
Access Program. Now in its sixth year, the 
US$13-million project gives Chinese astrono- 
mers observation time on optical and infrared 
telescopes around the world. 

Ross Howie, a physicist who completed
his PhD at the University of Edinburgh, UK,
three years ago, is enjoying the greater
autonomy his move to China in 2015 has allowed
him. He is investigating materials synthesis
under extreme pressures and temperatures
at the Center for High Pressure Science and
Technology Advanced Research in Shanghai.
“I have been given a platform to build
my own laboratory and have almost complete
freedom to pursue my research goals,”
says Howie.


ENTRY REQUIREMENTS

* Employers must apply for work permits on
behalf of employees. This includes presenting
CVs, university transcripts and certificates, and
making the case that no Chinese national can
fulfil the job requirements.

* Once a work permit has been granted,
employees must apply for a single-entry work
visa called a Z visa.

* On arriving in China, the employee and their
employer have one month to apply for a
residence visa. The visa can take up to three
months to process and is valid for one year,
after which it must be renewed annually.

* The bureaucratic hurdles associated with visas
and work permits can be frustrating; some
have reported having to fly back to their home
country to wait for documents.


Despite such opportunities, relocating to 
China can come with challenges. Many grant 
proposals must be written and defended in 
Mandarin, although some organizations, 
such as the National Natural Science 
Foundation of China, have started to accept 
grant proposals in English. Scientists complain 
of excessive paperwork and frequently 
changing policies. Research institutions can 
reallocate grant money to other projects that 
don't have funding. And Internet censorship 
hinders research. 

Jordanian neuroscientist Nashat Abumaria, 
at Fudan University in Shanghai, has seen such 
challenges defeat a number of foreign research- 
ers since he arrived in the country in 2007. But 
he enjoys the drive and ambition of his Chinese 
colleagues, some of whom he has developed 
strong friendships with. Abumaria has turned 
down job offers from elsewhere and plans to 
stay as long as he can. “I’ve learned one
important thing,” he says. “You cannot expect China
to change for you. You must change and adapt
with China.”

Peng advises those willing to give working
in China a try to think like an entrepreneur to
make the most out of their experience. “There
are definitely failings in the system,” he says.
“But there are a lot of advantages too.”


CHINA HAS 50 UNESCO WORLD 
HERITAGE SITES, SECOND IN THE 
WORLD ONLY TO ITALY. 


OPPORTUNITIES & CONTACTS 

* The 1000 Talent Plan of Foreign Experts 
offers attractive salary and research funding 
packages to those willing to work in China 
for three years. 

* The 1000 Talent Plan for Young 
Professionals seeks to attract scientists aged 
under 40 who are seen as future leaders in 
their fields. 

* The President’s International Fellowship 
Initiative provides Chinese Academy of 
Sciences funding to PhD students and 
postdocs. 

* The National Natural Science Foundation 
offers grants to both foreign and returning 
Chinese scientists seeking to collaborate 
with Chinese scientists. 


S8 | NATUREJOBS CAREER GUIDE | ASIA-PACIFIC 2016 
© 2016 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 


SAMANTHA DU 
Chief executive officer at Zai 
Laboratory, Shanghai 


What is it like working in biotechnology 
in China? 

The government wants to promote 
innovation, so it is very interested in 
the biotech sector. Both Chinese and 
global investors are keen to invest in it. 
Many Chinese entrepreneurs, people 
in the pharmaceutical industry and 
researchers want to come back to 
China to start their own businesses. 
Compared with the United States and 
other developed countries, Chinese 
biotech is still at an early stage, but it’s 
catching up fast. The sector is booming. 


What skills are biotech companies 
looking for in foreign scientists? 

They are the same as any US biotech 
company. I see a lot more opportunities 
opening up. We have recruited a group 
of overseas talent that has tremendous 
experience in drug discovery and 
development. 


How will foreign biotech scientists 
benefit from working in China? 

It will help them to understand biotech 
culture and pharmaceutical research 
in China. Eventually, if they want to go 
back to their home country they will 
have a lot of connections here. It also 
gives them the opportunity to return to 
China later, possibly as a representative 
for US or European companies. 


What is the best thing about working in 
biotech in China? 

It’s very dynamic. For example, the 
government recently approved a 
pilot plan that will allow contract 
organizations to produce drugs. Those 
working in biotech in China are at the 
forefront not just of scientific discovery, 
but also of a generation pushing for 
new regulations that will pave the way 
for the development of truly innovative 
medicines in China. R.K. 


This interview has been edited for 
length and clarity. 
The Mercury Magnetospheric Orbiter will study the planet’s atmosphere and magnetic field. 


JAPAN 


The government is stepping up efforts to attract international 
scientists, as the country invests record sums in research. 


SMRITI MALLAPATY 


Cevayir Coban has witnessed a major 
cultural shift in Japanese universities 
since arriving in the country in 2003. 
Coban, her husband and their one-year-old son 
had left the United States so that she could join 
the laboratory of immunologist Shizuo Akira 
at Osaka University. 

At first, Coban, a malaria immunologist 
from Turkey, kept herself to herself because she 
couldn't understand what her colleagues were 
saying. “The attitude back then was: ‘If you don’t 
speak Japanese, you should learn it’,” she says. 

That sense of isolation has eased, partly 
because she learned the language, but also 
because universities have changed their 
approach to non-Japanese speakers: they now 
seek out top-level international researchers 
to work in their labs. “In the past ten years, 
universities and institutions all over Japan have 
made an active effort to globalize,” Coban says. 
“The attitude today is: ‘You don’t have to speak 
Japanese to work here’.” 

Around the time of Coban’s arrival, the 
Japanese government, mired in a long period of 
economic stagnation, recognized that the coun- 
try was ill-equipped to operate in the rapidly glo- 
balizing international research environment. In 
2007, it set out its World Premier International 
Research Center Initiative (WPI), a plan to 
establish centres of excellence that would attract 
talented international researchers. Coban now 
heads her own lab at one of these centres: Osaka 
University’s Immunology Frontier Research 
Center (IFReC), which is led by Akira. The cen- 
tre is fully bilingual, 25% of its researchers are 
non-Japanese and it employs a team of staff to 
help researchers to relocate and deal with the 
inevitable paperwork involved in 
CAROLINE FERN BENTON 


Vice-president at the University of Tsukuba 


What kind of researchers do you hire? 
The university looks for individuals 
who are open to cross-disciplinary and 
cross-border collaboration. Being an 
open, global university is very much 
part of our DNA, and it is reflected in 
our philosophy and history. One of our 
prominent founding figures was Kano 
Jigorō — the creator of modern judo and 
the first person to bring foreign exchange 
students to Japan. 

In 2014, as part of a national effort 
to raise the international prominence 
of Japanese universities, we introduced 
an initiative to become a trans-border 
university in every sense of the word. 
We want to cross national, institutional, 
disciplinary and academic–industry 
borders. We are aiming to triple the 
number of foreign researchers at 
Tsukuba to more than 300 by 2023. 


Why are Japanese universities so keen 
to globalize? 

The number of 18-year-olds in Japan, 
and therefore the number of prospective 
university students, is declining. This 
means that we will need to bring in 
bright, young people to fill the university 
places and eventually to join the 
workforce. Research and education have 
also become much more global. We 
need to motivate our researchers to go 
abroad to expand their horizons. 


What do you enjoy most about working 
in Japan? 

Japanese culture values harmony and 
teamwork. This means you feel a sense 
of belonging to a group. These values 
are promoted through relationships 
and a common understanding, so 
employers tend to take a longer-term 
view of employment and business. Even 
research labs at private companies tend 
to have a much longer-term focus than 
US companies, for example, which face 
greater pressure to deliver short-term 
results. S.M. 


This interview has been edited for 
length and clarity. 
The top ten institutions in Japan, based on research output included in the 2015 
Nature Index, May 1 2015-April 30 2016, shown as weighted fractional count (WFC), 
a measure of the relative contribution of an author to an article weighted to correct 
for imbalances between subjects. Bars are divided according to the proportion that 
each subject area contributes to the overall score. 


[Map of Japan. Key: chemistry; Earth and environmental sciences; life sciences; physics; major industry employer. Map labels: Sapporo, Tokyo, Kyoto, Fukuoka, Osaka. Major industry employers marked: Sony, Toyota, Panasonic.] 

SALARIES 

Starting salaries in Japan sit around the middle of the list of Asia-Pacific 
countries profiled, according to data collected in Nature’s interviews. 
COLLABORATIONS 

Japan’s average collaboration score (top) — the sum of Nature Index’s fractional 
count (the relative contribution of authors to an article) for international 
collaborations divided by the number of countries Japan collaborates with. 
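
The definition above is a simple average, and a minimal sketch may make the arithmetic concrete. It assumes the per-partner fractional counts are already available; the function name, the partner labels and the numbers below are hypothetical illustrations, not Nature Index data or code.

```python
def average_collaboration_score(fc_by_partner):
    """Sum of fractional counts for international collaborations,
    divided by the number of partner countries.

    fc_by_partner: dict mapping a partner country (hypothetical labels here)
    to the summed fractional count of articles co-authored with it.
    """
    if not fc_by_partner:
        raise ValueError("at least one partner country is required")
    return sum(fc_by_partner.values()) / len(fc_by_partner)


# Made-up figures for three partner countries, for illustration only.
example = {"Country A": 120.0, "Country B": 60.0, "Country C": 30.0}
score = average_collaboration_score(example)
print(score)
```

A higher score under this definition means more shared output per partner country, not necessarily more partners.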
RESEARCH FOCUS 

Japan will become the first Asian country to name a chemical 
element, it was announced in the final hours of 2015. An 
international committee of chemists had attributed the discovery of 
element 113 to a team led by chemist Kosuke Morita at the RIKEN 
Nishina Center for Accelerator-Based Science in Wako. 

Elements with more than 92 protons are unstable and decay 
into lighter, more stable atoms after only a fraction of a second. 
Proving the existence of these ephemeral elements is therefore very 
difficult. Morita’s team began its search for element 113 in 2003 
by bombarding atoms of the heavy metal bismuth with beams of 
zinc travelling at one-tenth of the speed of light. It took the group 9 
years to produce convincing evidence that it had created an isotope 
of an element with a nucleus of 113 protons and 165 neutrons, and 
another 3 years for the feat to be formally acknowledged. 

The team has proposed the name nihonium, with the symbol Nh, 
after Nihon, Japanese for ‘Japan’. The element’s place, on the seventh 
row of the periodic table between copernicium and flerovium, is 
expected to be confirmed by the International Union of Pure and 
Applied Chemistry in November, following a five-month public 
consultation on the name. 

In January, the government adopted a five-year plan that identifies 
robotics, sensor and actuator technologies, biotechnology, human- 
computer interaction, nanotechnology and quantum technology 
as areas of existing research strength that should be prioritized for 
investment. The government is also committed to funding a major 
cybersecurity project to safeguard crucial transportation, energy and 
communications infrastructure. 


Chemist Kosuke Morita discovered element 113. 
applying for grants and other administration. 
In 2014, the Japanese government stepped up 
its efforts to make the country’s higher educa- 
tion more competitive and compatible with the 
rest of the world by selecting 37 universities to 
receive funding under its Top Global University 
Project. The programme aims to increase the 
proportion of researchers at these institutions 
who are either foreign-born or Japanese with 
overseas degrees from 28% to 47%. Moreover, 
the aim is that by 2023, 22% of classes will be 
taught in languages other than Japanese — close 
to triple the current percentage. “We need to 
capture the power of globalization and openness 
to address global and social challenges together,” 
says economist Yuko Harayama, an executive 
member of the Japanese government's Council 
for Science, Technology and Innovation. 


“THE ATTITUDE TODAY IS: 
‘YOU DON'T HAVE TO SPEAK 
JAPANESE TO WORK HERE’.” 


Despite these efforts, Japan is still falling 
short of its goals. The number of international 
researchers in appointments and placements 
of more than 30 days at Japanese academic and 
research institutions reached a peak of around 
15,100 in 2012 — only a slight increase on the 
12,800 researchers in 2003, and a small propor- 
tion of the 844,000 researchers in Japan. 


“We have been promoting globalization for 
more than 20 years, but have not yet achieved the 
status we would like,” says Harayama. Universities 
are under multiple pressures, she adds. They 
are required to open up to foreign influences, collaborate 
more with the private sector and focus 
to a greater degree on science that serves society. 

Japan also falters when it comes to women in 
science. Not only is Coban one of the first inter- 
national professors in the natural sciences at 
Osaka University, but she is also the only female 
principal investigator at IFReC out of almost 30. 
Nationally, women make up only about 15% 
of researchers and only 6% of corresponding 
authors — compared with an Organisation for 
Economic Co-operation and Development aver- 
age for female corresponding authors of 26%. 

But Harayama is optimistic about the future. 
In January, the government released a new sci- 
ence and technology plan, which was prepared 
with greater private-sector involvement than 
previous strategies — industry accounts for 72% 
of research and development expenditure in the 
country. Starting in April, organizations that 
employ more than 300 people are required to set 
minimum targets for employment of women and 
the proportion of women in management roles. 

Meanwhile, Japan remains committed to 
investing in science and technology. In 2014, it 
spent a record high of 3.6% of its gross domestic 
product on research and development, making 
it the third highest investor globally after South 
Korea and Israel. 


JAPAN HAS CLOSE TO 30,000 NATURAL HOT SPRINGS, 
ENTRY REQUIREMENTS 

* Researchers must apply for a Highly Skilled 
Foreign Professionals visa, which grants workers 
and, if applicable, their spouses permission to 
live and work in Japan for up to five years. 

* The application process takes between one 
and three months. Researchers must first 
obtain a certificate of eligibility from their 
employer. This is issued through a system 
that awards points based on criteria such 
as academic and professional experience. 
Employees include this certificate when 
applying for a visa at their Japanese embassy. 

* On entering Japan, arrivals receive a residence 
card that they must carry at all times. Card 
holders can leave as many times as they wish, 
but they must re-enter the country within a year. 

* Visa-holders can apply for permanent 
residency after five years. 


OPPORTUNITIES & CONTACTS 

* The Japan Research Career Information 
Network (JREC-IN) Portal lists jobs for 
researchers in academia, industry and the 
public sector. 

* The Japan Aerospace Exploration Agency 
(JAXA) International Top Young Fellowship 
offers early-career researchers the opportunity 
to work at the Institute of Space and 
Astronautical Sciences for three to five years. 

* The RIKEN Special Postdoctoral Researchers 
Program supports young researchers engaged 
in an independent project of their choosing for 
up to three years. 

* The Japan Society for the Promotion of 
Science Postdoctoral Fellowship for Overseas 
Researchers offers grants to highly qualified 
researchers to collaborate with Japanese 
scientists for between one month and two years. 
BILL MUNRO 

Group leader of the Theoretical Quantum 
Physics Research Group at NTT Basic 
Research Laboratories (BRL) in Atsugi 


Are there any cultural differences that 
researchers coming to work in Japan 
should be aware of? 

No one really says yes or no, so you 
need to get used to reading between 
the lines. If someone tells you that 
something ‘will be difficult’, then you 
probably shouldn’t do it. But many 
Japanese people have now learned to 
be more direct with their non-Japanese 
co-workers. 


How open is the private sector to 
international researchers? 

Many industries are hiring more 
foreigners. About 15% of NTT BRL 
employees are non-Japanese and 
this proportion is slowly increasing. 
International researchers can apply for 
two-year postdocs. Older, grey-haired 
people like myself can apply for five- 
year research specialist positions. 


Is private-sector research in Japan 
done differently? 

Overseas, the trend is for industrial 
research to become more applied, 
more product-oriented, but in Japan 
most of the big companies still carry 
out fundamental research. Companies 
here also cover a wider breadth of 
research fields, and the labs are 
incredibly well-resourced. 


Are there any downsides to working 
in Japan? 

Historically, private-sector research 
has been a very individual activity in 
Japan. This is changing, but more 
slowly than in Western countries, where 
companies are moving towards larger, 
collaborative projects. A researcher 
can only do so much on their own. 
NTT laboratories are pushing towards 
more collaboration, both internally and 
externally. S.M. 


This interview has been edited for 
length and clarity. 
Many people coming to work in Australia find its cities cosmopolitan and welcoming places to live. 


AUSTRALIA 


Scientists from across the world are attracted to the country, 
which competes internationally by focusing on its strengths. 


BY KAREN MCGHEE 


Australian research has delivered a series 
of major advances, from the bionic ear 
and Wi-Fi to the cervical-cancer vaccine 
and the black-box flight recorder. It has played 
crucial roles in the development of Google Maps, 
penicillin and ultrasound medical imaging. 
The country’s history of innovation and its 
model of scientific development are seen by 
many as consequences of its geography and 
demography. Australia’s comparatively small 
population of 24 million lives in a vast country 
that is not only far removed from the world’s 
major centres of development, but also dom- 
inated by an arid interior. It’s of little surprise, 
then, that Australians have a reputation for 
resourcefulness and self-reliance. 
Melbourne and Sydney, which are home 
to almost half the population, are the nation’s 
hotbeds of scientific endeavour. Each has more 
than a dozen internationally ranked universi- 
ties and research institutes that have assembled 
international teams in specialist fields. 

“We eat lunch every day with German, 
American, French, Chinese, Iranian, Spanish, 
Singaporean and Australian researchers,” says 
J. J. Richardson, a bioengineer at Australia’s 
largest research body, the Commonwealth 
Scientific and Industrial Research Organisa- 
tion (CSIRO) in Melbourne. Richardson, who 
moved to Australia from the United States in 
2010, has been working alongside his Chinese 
colleague Kang Liang on biological applications 
of a class of crystalline materials called 
metal-organic frameworks. 

Liang says that Australia has a reputation 
for making occasional, rather than 
LINDA KRISTJANSON 
Vice-chancellor at Swinburne University 
of Technology in Melbourne 


How does Australia’s geographical 
distance from Europe and the United 
States shape its research? 
Distance is not a barrier. We 
build research alliances and new 
opportunities for our researchers and 
students through our partnerships 
and shared quest for knowledge. For 
example, our astronomy students in 
Melbourne have access to the most 
powerful telescope in the world, at the 
Keck Observatory in Hawaii, through 
both remote access and site visits. This 
has allowed our astrophysics team to be 
ranked among the best in the world. 
Our researchers routinely collaborate 
with other leading scientists and 
academic leaders from around the world 
through participating in conferences, 
collaborative research projects and joint 
initiatives. Our digital connections keep 
us engaged, and we think in a global way 
to ensure the work we do has impact. 


And what about collaborations within 
the Asia-Pacific region? 

Australia is exceptionally well located to 
work with our academic and research 
partners in the Asia-Pacific region. We 
can undertake research that addresses 
issues of mutual concern, such as 
energy, food security, manufacturing, cli- 
mate change and caring for older people. 


What is it about Australia that made 
you stay? 

When I first moved to this country 
from Canada 20 years ago to pursue 
a career in palliative-care research, I 
found Australia to be a resilient country 
that welcomes people with aspirations 
and a readiness to engage and reach 
high. Quality of life, an intellectually 
stimulating environment, and people’s 
willingness to embrace diversity, adopt 
new technologies and build innovation 
capability make Australia an attractive 
place to pursue a scientific career. K.M. 


This interview has been edited for 
length and clarity. 
WHERE TO WORK 

The top ten institutions in Australia, based on research output included in the 2015 
Nature Index, May 1 2015-April 30 2016, shown as weighted fractional count (WFC), 
a measure of the relative contribution of an author to an article weighted to correct 
for imbalances between subjects. Bars are divided according to the proportion that 
each subject area contributes to the overall score. 


[Map of Australia. Key: chemistry; Earth and environmental sciences; life sciences; physics; major industry employer. Major industry employer marked: Cochlear.] 

SALARIES 


Postdocs in Australia are the highest paid in the Asia-Pacific countries 
profiled, according to data collected in Nature’s interviews. 
COLLABORATIONS 


Australia’s average collaboration score (top) — the sum of Nature Index's 
fractional count (the relative contribution of authors to an article) for international 
collaborations divided by the number of countries Australia collaborates with. 
RESEARCH FOCUS 

The opening of the Sydney Nanoscience Hub has provided a 
boost to Australia’s already rising international reputation for 
nanoscience research. 

Unveiled in April, the Aus$150 million (US$112 million) precision-built 
and engineered facility took six years to design and construct. 
It is the centrepiece of the new Australian Institute for Nanoscale 
Science and Technology, based at the University of Sydney. 

Nanoscience is the study of nanoscale materials (a nanometre 
is one billionth of a metre) and involves working with electrons, 
molecules, atoms and photons. Experiments can sometimes last 
only trillionths of a second. Because of this, research needs to 
be conducted in precisely controlled environments. The Sydney 
facility meets this demand by offering researchers laboratory 
space designed to minimize possible disturbances such as dust, 
pressure and temperature fluctuations, mechanical vibrations and 
electromagnetic radiation. 

One of the Hub’s three initial flagship projects focuses on optical 
physics and nanoscale photonics — hardly surprising when you 
consider that photonics (the study and application of light) is used 
in communications, imaging and data storage, underpinning a 
multi-trillion-dollar industry. One team, led by Benjamin Eggleton, 
the director of the Institute of Photonics and Optical Science at 
the University of Sydney, is attempting to develop a photonic chip 
that’s faster and more energy-efficient at gathering and processing 
information than conventional electronic devices. 


The University of Sydney is home to the new Sydney Nanoscience Hub. 
consistent, scientific advances. “It’s like a bubble 
that pops and a great innovation comes out, 
then it recedes for a while,” he says. 

Partly because of its small population, Aus- 
tralia has developed a number of specialisms, 
on which it focuses its efforts and resources. 
Health and medical research, for instance, has 
brought the country seven Nobel prizes. Physics 
and chemistry researchers have claimed three. 


“AUSTRALIANS ARE NOT AFRAID 
OF TRYING NEW THINGS.” 


Astronomer Brian Schmidt shared the 2011 
Nobel Prize in Physics for the discovery of the 
accelerating expansion of the Universe through 
observations of distant supernovae. US-born 
Schmidt, who is vice chancellor of the Australian 
National University in Canberra, is among many 
advocates who have spoken out about boosting 
the country’s science, technology, engineering 
and mathematics (STEM) capacities. 

There is certainly room for improvement. In 
2012, for example, just 0.4% of those entering ter- 
tiary education studied mathematics, compared 
with the 1% average reported for countries in the 
Organisation for Economic Co-operation and 
Development. The lack of home-grown STEM 
talent provides opportunities for postdoctoral 
and PhD applicants from other countries. 


Although overall research and development 
expenditure as a proportion of gross domes- 
tic product (GDP) has remained constant at 
a little over 2% in recent years, the funding of 
specific subjects has fluctuated as a result of 
changing political priorities. For example, 75 cli- 
mate-science positions at CSIRO will be lost this 
year as a result of successive government cuts 
to the organization's budget in 2014 and 2015. 

The geographical distance to international 
centres in the Northern Hemisphere can put 
some people off relocating to Australia. Sydney 
is around a 20-hour flight from either London or 
New York, for example. But many view Austral- 
ia’s remoteness as an advantage. 

“Being a little distant from the mainstream, 
you’re not influenced so much by what everyone 
else is doing; sometimes it means more space to 
pursue your own research interests,” says Kelan 
Chen, a postdoc at the Walter and Eliza Hall 
Institute of Medical Research in Melbourne. 
Chen came to Melbourne from China to do her 
undergraduate degree and is now investigating 
an epigenetic regulator implicated in a form of 
muscular dystrophy. 

A five-hour flight away in Perth, Sophie 
Monnier, a French PhD student studying exploration 
geophysics at the University of Western 
Australia, agrees. “Australians are not afraid of 
trying new things,” she says, “because they are 
not hindered by historical cultural values.” 


AUSTRALIA'S ISOLATED LOCATION MEANS THAT 
ENTRY REQUIREMENTS 

* Most researchers coming to Australia use 
the Temporary Work (Skilled) 457 visa, which 
allows skilled workers with dependents to 
work in their chosen field for up to four years. 
Applicants may be sponsored by an employer 
or considered a good fit for an open position. 
Visa costs start at Aus$1,060 (US$800) 
and applications are usually processed within 
90 days. 

* Another option is the Skilled Independent 
189 visa. This is aimed at English-speaking 
applicants under the age of 50 who are 
qualified to fill positions relevant to the 
country’s skill shortage list, but who are not 
sponsored by an employer. The price starts at 
Aus$3,600 (US$2,587) and the visa is assessed 
under a points-based system. 


OPPORTUNITIES & CONTACTS 

* International Postgraduate Research 
Scholarships provide tuition fees and health 
cover for two years for master’s degrees and 
three years for doctoral degrees. 

* Commonwealth Scientific and Industrial 
Research Organisation (CSIRO) Postdoctoral 
Fellowships are for PhD graduates with no 
more than three years of postdoc experience. 

* The National Health and Medical Research 
Council offers five-year fellowships for health 
and medical researchers. 

* Forrest Research Foundation Scholarships 
are available to students who want to study 
towards a PhD at Curtin University, Edith 
Cowan University, Murdoch University, 
Notre Dame University or the University of 
Western Australia. 
LARRY MARSHALL 

Chief executive of the Commonwealth 
Scientific and Industrial Research 
Organisation (CSIRO) in Canberra 


How do you think the way science is 
done in Australia could improve? 
There is a lot of great science being 
done and investment in science is 
quite high. But we need to get better 
at ensuring that more of this research 
yields innovations that have social, 
environmental and economic impact. 


What are the forces shaping the 
research landscape? 

There is too much ‘science push’ rather 
than ‘market pull’. The science is being 
pushed out from universities and from 
organizations such as CSIRO. The 
approach is ‘This is what we have done. 
What can you do with it?’ We need to 
recognize that our national challenges 
should be shaping our priorities. That 
will be a difficult transition, but it is one 
that this country needs to make. 


In what areas is Australia looking to 
recruit expertise from overseas? 
One focus is on digital technology 
and big data. We could achieve 
huge productivity gains in precision 
agriculture, and our health services 
are ripe for disruption. We also need 
to increase our ability to understand 
the impacts of ‘blue-economy’ 
developments, such as new fisheries. 
Other groups are looking at ways to 
obtain information in real time from 
living biological tissue by embedding 
sensors for diagnosis and intervention. 


Does the Australian public value 
scientific research? 

There is huge support for science in 
Australia because people recognize 
how much it has contributed to their 
day-to-day lives. Despite this, there is 
not much knowledge of the way that 
scientists work. We need to get better 
at explaining the scientific process in a 
way that relates to most people. K.M. 


This interview has been edited for 
length and clarity. 
DRC-Hubo, the humanoid robot that won its creators from KAIST a US robotics prize. 


SOUTH KOREA 


A big investor in research and development, South Korea is 
attracting top scientists in the hope of boosting basic science. 


BY MARK ZASTROW 


When computer scientist François 
Rameau made the decision to pursue 
his career abroad, he did it more 
quickly than most. After looking up the research 
output of the robotics and computer-vision labo- 
ratory at the Korea Advanced Institute of Science 
and Technology (KAIST) in Daejeon, South 
Korea, where he had been offered a position, he 
instantly rejected the three postdoctoral fellowships 
on offer back home in France. “I realized 
it’s probably one of the best in the world,” he says. 
“I went without hesitation.” 

Rameau is now working on driverless cars — 
one of the hottest research fields in computer 
science. He has a five-year fellowship from the 
National Research Foundation (NRF) of Korea 
and additional funding from the German man- 
ufacturer Bosch. Since Rameau arrived, the 
lab’s reputation has continued to grow: in 2015, 
a group of his colleagues won the US Defense 
Advanced Research Projects Agency Robotics 
Challenge with DRC-Hubo, a humanoid robot 
that can drive a car, climb stairs and open doors. 

Robotics is one of the many fields that have 
benefitted from South Korea's focus on research 
and development (R&D) since the start of the 
century. In 2014, the country spent 4.3% of its 
gross domestic product (GDP) on R&D — the 
highest globally, and twice what it spent in 1999. 

This investment has been partly spent on 
recruiting high-level international scientists 
to South Korean institutions. But these have 
often been joint appointments, under which the 
researchers work at the institutions for only a 
few months of the year. Although this approach 
may have lent prestige, it is widely viewed as having 
failed to improve research quality. As a result, 
many universities are now moving towards 
offering more long-term, full-time appointments. 

The Institute for Basic Science (IBS), a net- 
work of university-based research centres estab- 
lished in 2011, does not offer joint appointments. 
“Foreign recruits have to resign from their previ- 
ous position and move here,” says Doochul Kim, 
president of IBS, headquartered in Daejeon. 

Efforts to attract talented scientists to South 
Korea are partly driven by concern over the 
country’s low birth rate, and fears of a looming 
shortfall in the number of students taking up 
university places. The government hopes to 
double the number of foreign students at South 
Korean universities to 200,000 by 2023. It has 
also created a visa for entrepreneurs and offers 
citizenship to those who studied in Korea and 
have since been in employment for two years. 

Language can present hurdles for foreign 
researchers — calls for grant proposals are often 
only in Korean, for example. And there are few 
international schools outside of the capital. This 
is why, instead of living close to his lab at KAIST, 
Turkish chemist Cafer Yavuz, and his wife and 
son, live in Seoul. He makes the three-hour com- 
mute by bus to KAIST twice a week. “It’s a work- 
ing solution, but it’s not ideal,” he says. 

Rameau, who doesn't have children, has no 
such worries. He is considering looking for an 
academic position in South Korea when his fel- 
lowship ends. He says those moving to the coun- 
try might find that it takes a while to get their 
bearings. “Don’t judge too early,” he says. “You
need to stay for more than six months to start
to understand the state of mind of the people.”


S22 | NATUREJOBS CAREER GUIDE | ASIA-PACIFIC 2016
© 2016 Macmillan Publishers Limited, part of Springer Nature. All rights reserved. 


DOD PHOTO/ALAMY 


BYUNG GWON LEE 
President of the Korea Institute of 
Science and Technology (KIST) 


How easy is it to get a job at a South 
Korean institution? 

The job market is getting tougher, but the 
country’s increasing globalization offers 
great opportunities for foreign scientists. 


What advice do you have for scientists 
hoping to work in South Korea? 
Without learning Korean, foreign scien- 
tists cannot compete for R&D funding. 
For those who find work, I strongly advise
that they broaden their professional and 
personal network. Tight social and busi- 
ness circles can limit opportunities, but 
Koreans are warm-hearted, and Korean 
scientists in particular are open-minded 
to scientists from abroad. 


What are the relative benefits of working
for government institutes, universities 
or industry? 

Industry is well funded, but focuses on 
the profit motive to the extent that your 
project is continuously under review and 
can be terminated quickly. Government 
institutions also provide good resources, 
but it can be difficult to find high-quality 
students to fill positions, which can make 
it difficult to make progress. Academic 
institutions have access to very good stu- 
dents, but funding is sometimes limited. 


How does South Korea’s famously hier- 
archical work culture affect scientists? 
Local researchers are very respectful 
towards principal investigators and 
senior researchers. Foreign researchers 
have a bit more freedom. They can freely 
walk out of the lab, whereas Koreans 
typically ask permission if they want to 
leave before the principal investigator. 


What is the best thing about working 
in South Korea? 

It’s the Koreans. They are highly 
educated, and have strong characters 
and great energy. M.Z. 


This interview has been edited for 
length and clarity. 
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The top ten institutions in South Korea, based on research output included in the 
2015 Nature Index, May 1 2015-April 30 2016, shown as weighted fractional count 
(WFC), a measure of the relative contribution of an author to an article weighted to 
correct for imbalances between subjects. Bars are divided according to the 
proportion that each subject area contributes to the overall score. 
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Compared with researchers elsewhere in the Asia-Pacific region, postdocs are 
relatively poorly paid and professors relatively well paid in South Korea, according 
to data collected in Nature’s interviews. 
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COLLABORATIONS 

South Korea’s average collaboration score (top) — the sum of Nature Index’s 
fractional count (the relative contribution of authors to an article) for international 
collaborations divided by the number of countries South Korea collaborates with. 
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RESEARCH FOCUS 
The hunt for dark matter — which is thought to make up around 
27% of the Universe — could come to an end in South Korea. 

One possible component of dark matter is a theoretical 
particle called the axion. In 2017, physicists at the Center 
for Axion and Precision Physics Research (CAPP) in Daejeon 
expect to complete the development of a detector that could 
find this particle. CAPP director Yannis Semertzidis also hopes 
to build an experiment to detect the electric dipole moment 
of the proton, which if found could help to explain why there is 
more matter than antimatter in the Universe. 

CAPP is one of 26 research centres that make up the Institute 
for Basic Science (IBS), a national institution founded in 2011 
to bolster blue-sky research. The institute’s task is to rebalance 
the country’s scientific efforts to include more basic science 
alongside the previous focus on applied research, which was 
deemed to offer more immediate economic benefits. Other 
centres in the network focus on different fields, including laser 
science, gene editing and nanomedicine. 

The IBS has plans to open further research centres that 
specialize in areas of mathematics, environmental science, 
theoretical and nuclear physics, optics, new materials and 
the use of nanotechnology in biomedicine. It plans to recruit 
scientists to serve as directors and associate directors of these, 
and offer fellowships for young scientists, in 2017. 


OPPORTUNITIES & CONTACTS 

* The Korean Government Scholarship Program offers 
scholarships to international graduates or undergraduates, 
consisting of one year of Korean instruction, followed by up to six 
years of study. 

* The International R&D Academy at Korea Institute of Science 
and Technology trains scientists and engineers from developing 
countries and offers PhD programmes through the affiliated 
University of Science and Technology. 

* The government-run Brain Korea 21 Program for Leading 
Universities and Students (BK21 PLUS) initiative funds professor 
and postdoc positions at various research universities.
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The Indian Space Research Organisation launches a rocket carrying a navigational satellite.


INDIA 


Although not a major scientific player, India hopes that
attracting foreign researchers will help it achieve its ambitions. 


BY T.V. PADMA 


“It is a dynamic place with a fast-developing research landscape,”
says biophysicist Darius Koster. “Coming here provides an opportunity
to bring in your own ideas and be part of this change.”

Originally from Germany, Koster moved
to India in 2011 to study the mechanisms of
cell-membrane organization as a postdoc at the
National Centre for Biological Sciences (NCBS)
in Bangalore. As a foreign scientist working in
the country, he is part of a small minority: in the
past, Indian institutions have not tried to attract
researchers from abroad.

India is not a major global scientific player.
It has few scientists relative to its population,
and many of its most talented researchers move
abroad to work. The country invests less than
0.9% of its gross domestic product in research
and development. Many think it needs to
embrace experience from outside the country to
achieve its ambitions. “We can no longer afford to
have a frog-in-the-well attitude,” says astrophysicist
Tarun Souradeep at the Inter-University Centre
for Astronomy and Astrophysics (IUCAA)
in Pune. “Foreign scientists bring a diversity of
research cultures and values.”

A few institutions have begun to hire foreign
postdocs and faculty members. These include
the Tata Institute of Fundamental Research in
Mumbai, the Indian Institutes of Technology,
the Indian Institutes of Science Education and
Research, the IUCAA and the NCBS.


India has big ambitions. In 2015, the govern- 
ment announced a plan to become a leading 
nation in terms of computing power by creating 
a network of around 70 supercomputers to connect
the nation’s academic and research institutions.
It also revealed a five-year strategy to turn
the country into a global genomics hub. And 
in 2018, the Indian Space Research Organisa- 
tion plans to launch Chandrayaan-2, its second 
lunar mission, consisting of an orbiter, lander 
and rover. 

The low price of goods and services relative
to average salaries in India means that researchers
can enjoy a good standard of living. The
average salary in both Pune and Bangalore 
buys more goods and services of an equivalent 
standard than the average wage in New York 
City, according to the crowdsourced database 
Numbeo. However, publicly funded institutions
do not contribute to pension schemes for foreign 
scientists, as they do for local employees. 

Life in India can be challenging. “We knew
that India would be hot, but nothing prepared us
for our first summer here,” says physicist Richard
Morris, whose 2015 move from the United Kingdom
with his wife and two daughters to work
at the NCBS came shortly before a severe heat
wave and drought. Axel Brockmann, a German
honeybee specialist at NCBS who took an assistant-professor
position in 2012, found it difficult
to get used to the cows that roam freely in the
road and to different attitudes to time-keeping
in some quarters. “Here, 5 minutes can mean
anything from 15 minutes upwards,” he jokes.
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SATYAJIT MAYOR 
Director of the National Centre for 
Biological Sciences (NCBS) in Bangalore 


What are the attractions of doing 
research in India? 

India does not have many postdocs, so 
there is the potential, in this small system, 
for pursuing one’s own vision. At a time
when research resources are plateauing 
in most parts of the world, India has an 
opportunity to invest more — much as 
China and Japan are doing. 


Are there any downsides for researchers 
coming to the country? 

The notorious Indian bureaucracy is the 
country’s Achilles heel. For foreign scien- 
tists, getting permission to work in India 
can be complicated. The slow delivery
of funds from funding agencies and the
creaking research infrastructure can also 
be an issue. And high-quality research is 
mostly confined to a few top institutes. 


How has the NCBS attracted scientists? 
The NCBS has an open structure, 
without rigid departmental boundaries. 
It works on biology at every scale — 
from molecules to the ecosystem 
— and has a strong interdisciplinary,
multidimensional research culture. 

A good example is its programme on 
chemical ecology, which is the study of 
chemicals involved in the interactions 
within and between organisms. It is a
unique opportunity to study uncharted 
territory in biology. 


What opportunities does India offer? 
As India’s small scientific community 
grows, there is enormous scope to chart 
new directions in research across all 
disciplines. The Indian subcontinent has 
unique geological systems that reflect 
rare changes in Earth’s geological history. 
It also has a diverse human genetic
pool and is teeming with unexplored
ecological niches — all of which offer 
great research opportunities. T.V.P. 


This interview has been edited for 
length and clarity. 


NATIONAL CENTRE FOR BIOLOGICAL SCIENCES 


WHERE TO WORK 

The top ten institutions in India, based on research output included in the 2015 
Nature Index, May 1 2015-April 30 2016, shown as weighted fractional count (WFC), 
a measure of the relative contribution of an author to an article weighted to correct 
for imbalances between subjects. Bars are divided according to the proportion that 
each subject area contributes to the overall score. 
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Starting salaries in India are the lowest of those in the Asia-Pacific countries 
profiled, according to data collected in Nature’s interviews*. 
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COLLABORATIONS 

India’s average collaboration score (top) — the sum of Nature Index’s fractional 
count (the relative contribution of authors to an article) for international 
collaborations divided by the number of countries India collaborates with. 
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RESEARCH FOCUS 

Ripples in space-time called gravitational waves were detected for 
the first time in 2015 — around 100 years after Albert Einstein 
predicted them in his general theory of relativity. Astronomers 
hope that the observations, made by twin Laser Interferometer 
Gravitational-Wave Observatory (LIGO) detectors in the United 
States, will provide a better understanding of gravity and answer 
other major questions about the Universe. 

In February, India signalled its growing ambitions in 
astronomy by agreeing to build a third LIGO observatory, costing 
US$183 million. Alongside its US counterparts, LIGO-India will 
allow astronomers to pinpoint the source of the cataclysmic 
cosmic events that cause gravitational waves — such as the 
collisions of massive stars or black-hole mergers — much 
more accurately. The Inter-University Centre for Astronomy 
and Astrophysics has been advertising for postdocs and faculty 
members to support this research. 

In 2015, India formally joined an international collaboration 
seeking to build the Square Kilometre Array (SKA), which would 
be the world’s largest radio telescope, at sites in Australia and 
South Africa. It is leading the part of the consortium dedicated 
to the development of the hardware and software needed to 
control SKA. India has also emerged as a possible contender 
to host the Thirty Metre Telescope, the world’s most advanced 
optical and near infrared telescope, following protests over the 
original plan to build on land in Hawaii considered sacred by 
native people. 


OPPORTUNITIES & CONTACTS 

* The Wellcome Trust/DBT India Alliance — jointly funded by the 
UK Wellcome Trust and India’s Department of Biotechnology 
— offers Early Career Fellowships for postdoctoral biomedical 
researchers who want to develop their research careers in India. 
Intermediate and senior fellowships are also available. 

* The Human Frontier Science Program, based in Strasbourg, 
France, offers postdoctoral fellowships to those who want to 
pursue basic life-sciences research abroad, including in India, in 
a field other than their own.
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Singaporeans test a self-driving vehicle, technology that may soon be part of the transport system. 


SINGAPORE 


The government is spending more than ever before on research and
development, especially on work that is likely to produce returns.


BY TOM BENNER 


A global economic slowdown might seem
an unlikely driver of professional
advancement for a scientist. Yet that is
exactly what the financial crisis of 2007-08 was 
for French immunologist Florent Ginhoux. He 
began to consider working in Asia when the 
crash made it difficult to find an institution in the 
West that was willing to fund his plans to set up 
his own lab. In 2009, he moved his family to Singapore,
where he became a principal investigator.
He now leads a lab of 12 researchers and students, 
and has no intention of leaving any time soon. 

“In Europe or the US, sometimes when 
you start your lab you have to think small and 
slowly expand,” says Ginhoux, who is studying
the biology of dendritic cells with his group at 
the Singapore Immunology Network (SIgN), a 
research institute that is part of the Agency for 
Science, Technology and Research (A*STAR). 
“Singapore was the only place where I could 
really think about doing the science with all the 
resources and support I needed.”

The island city-state welcomes foreign talent 
such as Ginhoux as part of the government's 
commitment to promote research, innovation 
and enterprise as the cornerstones of its econ- 
omy. A wealthy country, Singapore spends 
2.2% of its gross domestic product on research 
and development (R&D). Its per capita R&D 
spending is among the highest in the world. In 
January, the Singaporean government unveiled 
a 5-year, S$19-billion (US$13.9-billion) plan to
support R&D in the country — an 18% increase 
on 2011-15. 

Up to S$4 billion of this will be invested in 
research collaborations between industry and 
academia, reflecting a broader trend in the way 
science is done in Singapore. In October 2015, 
for example, US-based engineering company 
Applied Materials established a joint R&D lab 
with A*STAR to develop advanced semiconductors
for logic and memory chips.

Although government funding is availa- 
ble for basic research, an increasing amount 
is being directed towards work that is seen as 
most likely to have an economic impact. “There 
are still significant resources available for basic 
research at A*STAR, but there is a tangible push 
towards work that has a more immediate return 
on investment,” says French geneticist Bruno
Reversade, who joined Singapore's Institute of 
Medical Biology in 2008. He now leads a group 
that investigates the genetics of rare diseases and 
how twins are produced. 

Renting property in Singapore is notoriously 
expensive, but the average salary there buys 
more goods and services of an equivalent stand- 
ard than the average wage in London, according 
to crowdsourced website Numbeo. 

Compared with neighbouring countries, 
western researchers may find Singapore easier 
to settle in. English is widely used, and there are 
plenty of international associations and schools 
for expats and their families. Ginhoux needed 
to do little to prepare for his move: “Singapore is 
open and foreigner-friendly,” he says.
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STEVEN BERNASEK 
Dean of Faculty at Yale-NUS College 


What has surprised you most about 
working in Singapore? 

The biggest surprise is how quickly 
the landscape can change, and not 
just physically, in terms of buildings 
and infrastructure. Decisions to 
support particular areas of science 
or technology development are 
translated into financial support and 
research infrastructure much more 
quickly than they are in the United 
States or Europe. 


What are the challenges and 
opportunities for foreign scientists? 
Keeping up with the pace of change 
can be difficult, as can the need
to rely on international suppliers
for laboratory equipment and
supplies. But many researchers are
excited about doing science in an 
environment less constrained by 
resources than they may have been 
used to. There are also opportunities 
to work in areas of fundamental 
science that have clear connections to 
problems in society. 


Should foreigners be aware of any 
differences in working cultures? 
Singaporean researchers can 
sometimes be more deferential and 
less willing to challenge the ideas 
and plans of their team leader than 
those from elsewhere. There can be 
a tendency to work strictly to a job 
description, rather than thinking 
about ways to actually accomplish 
tasks and suggesting alternative ways 
to go about things. 


What is the best thing about working 
in the country? 

It’s the optimism about science and the 
support for science. It’s such a contrast 
to doing basic science in the United 
States. T.B. 


This interview has been edited for 
length and clarity. 


YALE-NUS COLLEGE 


WHERE TO WORK 


The top ten institutions in Singapore, based on research output included in the 2015 
Nature Index, May 1 2015-April 30 2016, shown as weighted fractional count (WFC),
a measure of the relative contribution of an author to an article weighted to correct
for imbalances between subjects. Bars are divided according to the proportion that 
each subject area contributes to the overall score. 
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Starting salaries in Singapore are the highest among the Asia-Pacific 
countries profiled for professors, but less generous for those lower down the 
scale, according to data collected in Nature’s interviews. 
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COLLABORATIONS 


Singapore’s average collaboration score (top) — the sum of Nature Index’s 
fractional count (the relative contribution of authors to an article) for international 
collaborations divided by the number of countries Singapore collaborates with. 
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RESEARCH FOCUS 

Imagine cars that know where empty parking spaces are; buildings 
that can identify sources of air pollution; and smartphone apps that 
alert users when someone nearby has a heart attack, and locate
the closest defibrillator. Trend watchers have long predicted the 
advent of networks of connected sensors embedded in the physical 
objects around us, seamlessly collecting and exchanging data to 
enable a wide variety of innovations. 

Many cities are experimenting with such technologies to 
improve traffic management or street lighting, for example. The 
Singaporean government’s Smart Nation initiative, launched in 
2014, outlines a broader, bolder vision of the Internet of Things. 
One early example is Beeline, an app that analyses historical 
travel patterns and crowdsourced data to suggest routes to bus 
operators. Smart Nation is creating systems that combine data 
from sensors such as those in smartphones, other sensors and 
surveillance cameras to generate a live picture of the nation in 
unprecedented detail. The initiative’s fellowship programme 
offers funding to computer and data scientists with ideas for 
applications that would allow for better management of traffic 
congestion, power outages, floods and infectious-disease 
outbreaks, for example. The Singaporean government hopes that 
the country’s compact size and centralized political system will 
allow it to become a world leader in the field. But given its history 
of authoritarianism, some are sceptical about the government’s 
promise to anonymize the new data it will collect. 


OPPORTUNITIES & CONTACTS 

* The A*STAR International Fellowship is a two-year fellowship for 
postdocs who have completed their PhD in the past four years 
and final-year PhD students. 

* The Lee Kuan Yew Postdoctoral Fellowship is a three-year 
fellowship at Nanyang Technological University (NTU) and the 
National University of Singapore, to be held concurrently with a 
staff research position. It is open to foreign students. 

* NTU assistant professorships are tenure-track positions with 
attractive remuneration and start-up grant packages for foreign 
researchers who have obtained a PhD in the past ten years.




Jason Johnston works on automated monitoring of stored fresh produce at Plant and Food Research. 


NEW ZEALAND 


A small science community offers opportunities in a dramatic
landscape, but can also limit career progression. 


BY ANNABEL MCGILVRAY 


When mathematician Alys Clark left
the United Kingdom to do her master’s
degree in Australia, she only
intended to be gone for a year. She ended up 
following her course with a PhD, before moving 
“across the ditch” to take up a job in New Zealand.
Seven years after leaving home, she’s still
there, living with her New Zealand husband and 
their young son amid ferns and kauri trees, just a 
short drive from the black sands of the country’s 
west-coast beaches. 

“It’s pretty amazing, in that you can drive for
half an hour and you are on the beach and not
far away are the mountains,” says Clark, who
uses mathematics and physics to create virtual 
organs and identify early pathological changes 
at the University of Auckland. In 2014, Clark 
was awarded a five-year Rutherford Discovery 
Fellowship, which she is now using to mathe- 
matically model the physical processes involved 
in early pregnancy. 

Beyond the natural beauty on her doorstep, 
Clark feels at home in what she describes as a 
collegial, informal and inclusive research culture. 
“The senior academics are very approachable and 
supportive of researchers going off and setting up 
their own projects. The whole way of life is a little 
more relaxed than back home in the UK.”

New Zealand has a population of less than 
5 million and spends just 1.2% of its gross 
domestic product on research and development 
— half the international average, according to 
the Organisation for Economic Co-opera- 
tion and Development. As a result, the science 
community is small and the scope for career
progression is limited, particularly in academia.
“It’s not always an easy environment to stay in if
you want a long-term career,” says Clark.

But for others, the small pool has advantages. 
Having previously worked in his native Germany, 
as well as Switzerland and the United Kingdom, 
meteorologist Olaf Morgenstern moved to New 
Zealand to continue his climate-modelling 
research in one of the field’s most challenging 
environments. Before the move, a colleague told 
him that whereas in larger communities he may 
be a small fish in a big pond, in New Zealand, 
“the pond is all yours”. Indeed, since arriving to
take a position at the National Institute of Water 
and Atmospheric Research in 2008, Morgenstern 
has become something of a big fish: last year he 
took on a job in Wellington, leading the Earth 
system modelling and prediction programme for 
the Deep South National Science Challenge, the 
country’s Antarctic climate-science endeavour. 

A growing interest in climate science and 
modelling in New Zealand in recent years fits 
well with the country’s conventional strengths 
in Earth and environmental sciences. In the 
Asia-Pacific region, New Zealand leads the field, 
according to the Nature Index, which assesses 
research performance on the basis of contribu- 
tions to high-quality publications. Agricultural 
research is a particular strength, driven by both 
government-owned research institutes and large 
corporations, such as the multinational dairy 
cooperative Fonterra. Although New Zealand's 
dramatic landscape is central to its appeal to 
many international researchers, for others, such 
as Morgenstern, the country’s beauty is just an 
attractive backdrop to the scientific challenges 
and opportunities that it provides.
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PLANT & FOOD RESEARCH 


BRUCE CAMPBELL 

Chief operating officer at Plant and 
Food Research, headquartered in 
Auckland 


What aspects of working life in
New Zealand make it appealing?

You can pursue an international-standard
career, while also having
a good work-life balance. There’s
recognition of the importance of family 
and a collegial work environment. 


What distinguishes the country’s 
research culture? 

Plant and Food Research is one of seven 
government-owned Crown Research 
Institutes. Organizations such as ours 
have strong links with universities and 
industry groups, with secondments 
happening both ways, providing 
opportunities for diverse experiences. 
Because we are a small country, there 
are good opportunities for researchers 
to connect to key decision-makers in 
government and influence science- 
related policies. Science in New Zealand 
is currently undergoing reform to make 
the system much more collaborative 
between stakeholders such as research 
institutions, industry and government. 


Does New Zealand’s location make 
international collaborations difficult? 
When you’re working with researchers 
in other countries, there can be a
perception that you're a long way
away. We work hard to break down
that barrier with modern digital
communications. And scientists
here probably spend more time on
aeroplanes than those in other parts 
of the world. But it’s only an overnight 
flight to the western United States or 
around 24 hours to Europe. Isolation 
can also stimulate innovation, and the 
physical environment in New Zealand 
is conducive to thinking about different 
ways of doing things, which is a great 
starting point for science. A.M. 


This interview has been edited for 
length and clarity. 




WHERE TO WORK 


The top ten institutions in New Zealand, based on research output included in the 
2015 Nature Index, May 1 2015-April 30 2016, shown as weighted fractional count 
(WFC), a measure of the relative contribution of an author to an article weighted to 
correct for imbalances between subjects. Bars are divided according to the 
proportion that each subject area contributes to the overall score. 
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The relative gap between the best and worst paid in New Zealand is the smallest of those
in the Asia-Pacific countries profiled, according to data collected in Nature’s interviews.
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COLLABORATIONS 

New Zealand’s average collaboration score (top) — the sum of Nature Index’s 
fractional count (the relative contribution of authors to an article) for international 
collaborations divided by the number of countries New Zealand collaborates with. 
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Overlaps in subject areas may cause some distortion to relative subject proportions. 


RESEARCH FOCUS 

Perched on the edge of the Southern Ocean, New Zealand is an 
ideal base from which to assess the roles that the Antarctic and its 
surrounding waters have in climate change. 

Motivated by the most recent report by the Intergovernmental 
Panel on Climate Change — which says that existing international 
climate models do not accurately account for the unusual cloud- 
cover patterns caused by the low-pressure systems that dominate 
in the Antarctic — the New Zealand government has established 
the Deep South National Science Challenge. The multidisciplinary 
collaboration’s mission is to better measure the changes occurring 
now, and more reliably predict what might happen in the future. 

Deep South researchers are creating their own modelling 
system based on ongoing observations, and sharing their data 
with international groups to improve global modelling. Their 
results will also inform policies about domestic issues that could 
be affected by climate change. 

Deep South is 1 of 11 National Science Challenges established 
by the New Zealand government to address issues of national 
and international significance. The others include Ageing Well, 
Sustainable Seas, and Building Better Homes, Towns, and Cities. 
The aim is to bring together publicly funded Crown Research 
Institutes, non-government bodies and business to tackle these 
challenges. The government is backing the plan with NZ$1.6 
billion (US$1.2 billion) over a decade. 


OPPORTUNITIES & CONTACTS 

* The government’s New Zealand Now service offers personalized 
e-mails about job opportunities, including those in research, as
well as information on studying, working and living in the country. 

* New Zealand International Doctoral Research Scholarships 
provide university tuition fees, living expenses and medical 
insurance for three years for international students undertaking 
a PhD in the country. 

* The Li Lairong Horticultural Research Fellowship is available 
to Chinese students for research placements of up to three
months with Plant and Food Research, a government-owned
institute.
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Aier School of Ophthalmology 
Open positions at ASO 


The Aier School of Ophthalmology (ASO), located in Changsha, the 
capital city of central China’s Hunan province, was established by 
Central South University and the Aier Eye Hospital Group (Aier), 
China’s largest eye care service provider. As the first specialized 
ophthalmic college in the country, ASO will focus on Aier’s long- 
term development strategy of basic and clinical research, and 
leverage its strong financial position and rich clinical resources 
from over 120 affiliated professional eye hospitals, to train the next 
generation of talented ophthalmic researchers and professionals. 


ASO is now inviting applications for the following positions: 

1. Associate Dean of ASO 

2. Vice Director of the Aier Research Institute of Ophthalmology 

3. Research Principal Investigators (PI) in the fields of stem cells 
and precision medicine 


Qualifications: 

* A doctoral degree in medical science

* Prior employment as an associate professor or at a more senior level

* Demonstrated strong scientific research ability, with more than
10 publications in leading, peer-reviewed international journals

* A track record of securing external funding in overseas countries

* Excellent communication and organizational skills as well as a
strong sense of responsibility




AIER EYE HOSPITAL 


Salaries and benefits: 


* Annual salary: 500,000-1,000,000 RMB (around 80,000-150,000 USD) 

* Research funds as needed 

* 2-3 research assistants (including PhD students) 

* Accommodation for a transition period after relocation 

* An offer of Aier share options 

* An offer of Aier partnership program options 

* Reimbursement of travel expenses for domestic and international 
academic meetings 

* Incentives for publications in high-level journals 

* Incentives for science and technology achievements 

* Reimbursement of overseas travel expenses for up to two 
family-reunion trips a year 

* Basic insurance and other fringe benefits in accordance with 
national regulations 

* Recommendation and support for applications to various kinds of 
talent programs and funds 


To apply: 

Please send a curriculum vitae and a brief cover letter stating your 
career and research interests to Shibo Tang, professor and dean of 
ASO, Central South University at tangshibo@vip.163.com. 


Beijing Advanced Innovation Center for Brain Disorders (BAIC-BD) 


Recruiting Principal Investigators 


China Capital Medical University (CCMU) has the strongest overall capabilities 
in neuroscience in China and holds the country's largest clinical, epidemiological 
and biobank resources for major brain diseases. CCMU-affiliated hospitals that 
specialize in major brain diseases, such as Xuanwu Hospital, Tiantan Hospital and 
Sanbo Brain Hospital, are considered first rate in China and are highly regarded 
worldwide. CCMU and its affiliated hospitals receive the nation's largest number of 
patients with brain diseases, especially hard-to-treat brain-related illnesses. CCMU is 
equipped with abundant research resources and platforms for neuroscience development: 
it operates one National Clinical Disease Center for neurological and mental diseases, 
three National Key Disciplines, five National Key Specialties, two national centers for 
quality control, one Key Laboratory at provincial level, one Key Laboratory sponsored 
by the Chinese Ministry of Education and 11 Beijing municipal government sponsored 
Key Laboratories and Engineering Research Centers. 

The Beijing Institute for Brain Disorders (BIBD) at CCMU, funded by the Beijing Municipal 
Government, is a research center aimed at understanding the mechanisms underlying 
brain disorders, and at translating results from the laboratory bench to the patient's 
bedside, in order to reduce the burdens that the disorders impose on patients, families, 
and society. As the principal vehicle for the discipline at CCMU, BIBD carries out the overall 
planning of the neuroscience discipline, discipline construction and personnel 
training, and serves as an incubator for high-level research and development. To achieve 
major discoveries or inventions, and to form a platform for scientific research, 
translation into production and personnel training, BIBD operates a dual-track 
mechanism that combines public institutions and enterprises. 
Beijing Advanced Innovation Center for Brain Disorders (BAIC-BD), established 
in 2015, is one of the first Advanced Innovation Centers sponsored by the 
Beijing municipal government. Hosted by Capital Medical University and 
cooperating closely with BIBD, the center aims to advance the knowledge of brain 
diseases, translate science to treatment, and reduce the burdens of brain disorders. 
BAIC-BD strives to produce influential research discoveries and important healthcare 
products, establishing a world-class research center for the translational study of 
human brain diseases. 

BAIC-BD welcomes dedicated and talented scholars to join from China and abroad. 
With a well-developed support system, the center provides a modern and scholarly 
environment for innovative research and will offer incentives to encourage creativity 
and innovation. 


Responsibilities 

1. Carry out research and participate in all types of research activities, including 
building up and supervising research teams, designing and conducting experiments, 
publishing high quality research articles, filing patents, etc. 

2. Advance a specific mental health discipline, such as stroke, Alzheimer’s disease, 
Parkinson’s disease, brain tumor, depression, epilepsy, nerve injury and repair, and 
schizophrenia. 

3. Mentor junior faculty members, supervise graduate students, and advise 
professional and undergraduate students. 


Qualifications 

Applicants should have a Ph.D. or M.D. in neuroscience or a related discipline, be able to 
develop independent research, have more than 5 years of overseas experience, and have held 
at least an associate professorship (or equivalent) at a world-renowned university or research 
institute. Priority will be given to those with over 10 years' experience in a relevant 
field of brain diseases and with outstanding accomplishments. 


Application procedure 
Applicants should submit a curriculum vitae, 3-5 recommendation letters, copies of 
academic credentials and representative publications to liuyujia@ccmu.edu.cn and 
hwzheng@ccmu.edu.cn . 


Further Contact 

Address: Capital Medical University, Xitoutiao 10#, Youanmenwai, 
Fengtai District, Beijing 100069, China. 

Contact Person: Ms Liu, Ms Zheng 

Phone: +86-1083911208, +86-1083911894 

E-mail: liuyujia@ccmu.edu.cn hwzheng@ccmu.edu.cn 


Talents Wanted Programme 
Faculty search for the School of Environment, Beijing Normal University 


The School of Environment at Beijing Normal University is seeking 
faculty candidates under its newly launched and well-funded Talents 
Wanted Programme. All research fields covered by the School of En- 
vironment are eligible, including aquatic environments and ecologies, 
urban ecology, wetland habitats, basin environmental modification 
and ecological restoration, environmental engineering, hydrology and 
water resources, and environmental evaluation and management. 


Qualifications: 

Applicants should hold a PhD degree and are expected to be in their 
thirties or forties. Overseas postdoctoral experience will be considered 
an advantage. Applicants should have a demonstrated commitment to 
excellence in teaching and research at a level comparable to an assistant 
or associate professor in a top-tier international university. Successful 
candidates will be employed on a full-time basis and are expected to 
establish an internationally competitive, independent and cutting-edge 
research programme in a relevant field at the School of Environment. 


Benefits of working with us: 

* Successful candidates will be employed as principal investigators 
and will be able to supervise doctoral students. 

* The School of Environment will offer an internationally competitive 
salary and assist successful candidates in finding accommodation 
on or around campus. 

* The School of Environment will provide office and laboratory 
space as well as internationally competitive research 
start-up packages. 


Application process: 

Qualified applicants are strongly encouraged to submit their applica- 
tions electronically to Ms. Jun Zhang (zj0207@bnu.edu.cn). 
Applications should include the following materials (in PDF format): 
* a comprehensive curriculum vitae; 

* a research plan and teaching plan (maximum two pages); 

* three to five references with contact information for the referees. 


For further information please contact: 

Ms. Jun Zhang 

Tel: +86 10 58800397; Fax: +86 10 58800397 
E-mail: zj0207@bnu.edu.cn 

Website: http://env.bnu.edu.cn 
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School of Environment, Beijing Normal University 


DukeNUS 


Medical School 


LEADING RESEARCH AND INNOVATION 
IN SINGAPORE AND ASIA 


Backed by world-renowned Duke University and the National University of Singapore, Duke-NUS Medical School is a young, 
vibrant institution that punches above its weight. 


Duke-NUS' international reputation in biomedical research and medical education 
has grown exponentially since 2005. Furthermore, its partnership with the collective 
clinical strengths of Singapore's largest healthcare provider, SingHealth, creates a 
dynamic scholarly community for new discoveries, learning and care innovations, 
which ultimately benefit patients. 

With over S$350 million clinched in research funding, numerous research awards 
won and more than 2,800 papers published, our numbers speak for themselves. 
Opportunities to do meaningful biomedical, translational or clinical research at 
Duke-NUS have never been better. 

SingHealth Duke-NUS ACADEMIC MEDICAL CENTRE 


Explore research career opportunities at www.duke-nus.edu.sg/careers or read about our myriad research 
programmes at www.duke-nus.edu.sg/research 


FULL-TIME AND ADJUNCT FACULTY 
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Isotopic composition of plant water sources 


ARISING FROM J. Evaristo, S. Jasechko & J.J. McDonnell Nature 525, 91-94 (2015); doi:10.1038/nature14983 


In their study, Evaristo et al.¹ collected an extensive data set on the 
basis of which they statistically determined the isotopic compositions 
of the plant water source (δ¹⁸O_intersect and δ²H_intersect, called respectively 
δ¹⁸O_intercept and δ²H_intercept in their paper) as the x and y coordinates in 
(δ¹⁸O, δ²H) space of the intersection between the local meteoric water 
line (LMWL) and the plant xylem water 'evaporation line' (EL) for a 
range of climates and vegetation types. Evaristo et al.¹ showed that for 
80% of their referenced sampling sites, the mean value of the groundwater 
hydrogen isotopic composition (δ²H_gw) was statistically different 
from δ²H_intersect, supporting their hypothesis that the precipitation 
sources for groundwater recharge are different from the precipitation 
sources for plant water uptake. However, we consider that Evaristo 
et al.¹ made a mistake in their equation (2), such that their analysis of 
rainfall segregation between plant transpiration and groundwater is not 
valid. This error does not, however, call into question the paper's 
conclusion¹ of hydrological separation formulated on the basis of the 
precipitation offset. There is a Reply to this Brief Communication Arising 
by Evaristo, J., Jasechko, S. & McDonnell, J.J. Nature 536, 
http://dx.doi.org/10.1038/nature18947 (2016). 

The equation used in ref. 1 for the LMWL at a given sampling site is: 

$$\delta^{2}\mathrm{H} = a\,\delta^{18}\mathrm{O} + b$$

with a and b the slope and intercept with the y axis of the LMWL, 
respectively. Similarly for the xylem water EL: 

$$\delta^{2}\mathrm{H} = m\,\delta^{18}\mathrm{O} + n$$

with m and n the slope and intercept with the y axis of the xylem water 
EL, respectively. The intercept of the xylem water EL with the y axis 
is thus: 

$$n = \delta^{2}\mathrm{H} - m\,\delta^{18}\mathrm{O}$$


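The x and y coordinates of the intersection between these two lines 
(respectively δ¹⁸O_intersect and δ²H_intersect) are obtained in a 
straightforward manner (equations (1) and (2) below): 

$$a\,\delta^{18}\mathrm{O}_{\mathrm{intersect}} + b = m\,\delta^{18}\mathrm{O}_{\mathrm{intersect}} + n$$

$$\delta^{18}\mathrm{O}_{\mathrm{intersect}} = \frac{n - b}{a - m} \qquad (1)$$

$$\delta^{2}\mathrm{H}_{\mathrm{intersect}} = a\,\delta^{18}\mathrm{O}_{\mathrm{intersect}} + b \qquad (2)$$

leading to: 

$$\delta^{18}\mathrm{O}_{\mathrm{intersect}} = \frac{\delta^{2}\mathrm{H}_{\mathrm{intersect}} - b}{a} \qquad (3)$$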

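Equation (3) above corresponds to equation (3) in ref. 1. Yet, it can be 
demonstrated that their equation (2), that is, 

$$\delta^{2}\mathrm{H}_{\mathrm{intersect,Evaristo}} = \delta^{2}\mathrm{H} - m\,\delta^{18}\mathrm{O}$$

does not give the right δ²H_intersect, as δ¹⁸O_intersect is missing, and should 
therefore be rewritten as: 

$$\delta^{2}\mathrm{H}_{\mathrm{intersect}} = \delta^{2}\mathrm{H} - m\left(\delta^{18}\mathrm{O} - \delta^{18}\mathrm{O}_{\mathrm{intersect}}\right)$$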


We applied equations (1) and (2) above to the complete data set 
provided by the authors of ref. 1: we used their methodology together with 
our set of corrected equations to identify for each site the slope of the EL 
and the coordinates of the intersection between the EL and the LMWL 
(δ¹⁸O_intersect and δ²H_intersect). 
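The calculation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, the LMWL coefficients are set to a = 8 and b = 10 purely for illustration, and the EL slope and the xylem water sample are invented values that do not come from the data set of ref. 1.

```python
def el_intercept(d2h, d18o, m):
    """Intercept n of the evaporation line (EL) through one xylem water point."""
    return d2h - m * d18o


def intersection(a, b, m, n):
    """Intersection of the LMWL (slope a, intercept b) with the EL (slope m, intercept n)."""
    if a == m:
        raise ValueError("LMWL and EL are parallel; no unique intersection")
    d18o_int = (n - b) / (a - m)   # equation (1)
    d2h_int = a * d18o_int + b     # equation (2)
    return d18o_int, d2h_int


# Illustrative inputs only (hypothetical, not taken from the data set of ref. 1).
a, b = 8.0, 10.0                     # assumed LMWL slope and intercept
m = 4.0                              # assumed EL slope
xylem_d18o, xylem_d2h = -8.0, -70.0  # one hypothetical xylem water sample

n = el_intercept(xylem_d2h, xylem_d18o, m)
d18o_int, d2h_int = intersection(a, b, m, n)

# Equation (2) of ref. 1 returns n itself, that is, the EL value at d18O = 0.
d2h_ref1 = xylem_d2h - m * xylem_d18o
d18o_ref1 = (d2h_ref1 - b) / a       # equation (3) applied to that value

print(d18o_int, d2h_int)    # corrected intersection
print(d18o_ref1, d2h_ref1)  # values obtained with the original equation (2)
```

With these invented inputs the two approaches give clearly different source compositions, which is the kind of offset between the green and orange triangles described for Figure 1.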

Out of the 47 available sites, δ¹⁸O_intersect and δ²H_intersect could be 
computed for 46 sites. One site (ID 28) had only one plant xylem water 
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Figure 1 | Comparison between the erroneous and correct isotopic 
compositions of plant water source precipitation for two sites. 

The erroneous source precipitation oxygen and hydrogen isotopic 
compositions (green triangle in (δ¹⁸O, δ²H) space) were calculated with 
equations (2) and (3) of Evaristo et al.¹, whereas the correct ones (orange 
triangle) were calculated with equations (1) and (2) from the present work. 
The correct orange triangle is located at the intersection between the 
LMWL (solid black line) and the EL (dashed grey line). Shown are values 
for plant xylem water (grey triangles), mean groundwater (blue circle with 
error bars) and groundwater (blue circles) isotopic compositions plotted in 
(δ¹⁸O, δ²H) space. a, An example from Oregon, USA (same example as in 
extended data figure 2 of Evaristo et al.¹; site ID 26). b, An example of an 
inconsistent data set, from Arizona, USA (site ID 3). Error bars, ±1 s.d. 
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Table 1 | Site-specific plant water source and groundwater isotopic compositions 


Site ID | Plant water source δ²H_intersect (‰) | Plant water source δ¹⁸O_intersect (‰) | Groundwater δ²H_gw (‰) | Groundwater δ¹⁸O_gw (‰) | Criterion 1 (1 = yes, 0 = no) | Criterion 2 (1 = yes, 0 = no) | Plant versus GW 
1 | −39 (10.8) | −6.9 (1.5) | −25.5 (1.8) | −4.4 (0.3) | 1 | 1 | ** 
2 | −105.3 (7.7) | −13.7 (1.1) | −93.8 (21.8) | −12.1 (3.4) | 1 | 1 | ** 
3 | −2,653.4 (1416) | −407.7 (217.2) | −52.6 (3.1) | −7.8 (1.3) | | 0 | NA 
4 | −137 (23.9) | −18.4 (3) | −80.7 (0.2) | −11.1 (0) | 1 | 1 | ** 
5 | −93.8 (8.2) | −13.9 (1.3) | −81.5 (8.6) | −10.7 (0.9) | 1 | 1 | NS 
6 | −92.5 (6.9) | −12.2 (0.9) | −49.8 (13) | −7.4 (1.8) | 1 | 1 | ** 
7 | −233.7 (21) | −31.7 (2.8) | −74.2 (7.6) | −9.9 (1.1) | 1 | 0 | NA 
8 | −399.7 (128.9) | −57.7 (18.1) | −43 (5.1) | −6.4 (0.5) | 1 | 0 | NA 
9 | −57.1 (18.9) | −7.7 (2.5) | −28.8 (19) | −4.8 (1.9) | 1 | 1 | ** 
10 | −41.5 (0) | −6.6 (0) | −29.2 (2.1) | −4.3 (1.2) | 1 | 1 | ** 
11 | −28.8 (7.2) | −5.1 (1.1) | −24.7 (0.4) | −4.7 (0.4) | 1 | 1 | ** 
12 | −24.1 (3.3) | −5.4 (0.6) | −21.2 (16) | −4.2 (2) | 1 | 1 | NS 
13 | −72.5 (13) | −9.5 (1.7) | −67 (11.9) | −9.6 (1.6) | 1 | 1 | NS 
14 | −72.5 (107) | −9.8 (12.9) | −65.2 (6.2) | −9.2 (1.1) | 1 | 1 | NS 
15 | −64.9 (12.8) | −11.3 (2.1) | −40.1 (2.6) | −7 (0.5) | 1 | 1 | ** 
16 | −83.7 (4.8) | −12.8 (0.7) | −67 (22.2) | −8.8 (2.8) | 1 | 1 | ** 
17 | 11.6 (37.9) | 0.2 (4.7) | −46.6 (5.5) | −6.7 (0.4) | 0 | 1 | NA 
18 | −100.4 (39.7) | −14.3 (5.6) | −53.4 (9.2) | −8.3 (1.5) | 1 | 1 | ** 
19 | −77.2 (11.9) | −11.1 (1.6) | −62.8 (4.8) | −8.3 (0.8) | 1 | 1 | ** 
20 | −17.5 (14.3) | −1.6 (2.2) | −74.7 (6.5) | −9.7 (0.5) | 0 | 1 | NA 
21 | −415.2 (183.1) | −81.6 (35.7) | −29.3 (2.8) | −4 (1.1) | 1 | 0 | NA 
22 | −93.2 (0) | −13.6 (0) | −85.1 (5.9) | −12.7 (1) | 1 | 1 | ** 
23 | −551.2 (90.3) | −69.9 (11.2) | −106 (16.8) | −14.1 (2) | 1 | 0 | NA 
24 | −893.1 (469.6) | −121.3 (63.3) | −89.9 (12.6) | −12.1 (1.6) | 1 | 0 | NA 
25 | −97.3 (40.6) | −12.8 (5.2) | −104.7 (5.4) | −14.3 (0.8) | 1 | 1 | NS 
26 | −103.8 (11.4) | −14.6 (1.5) | −87.8 (53) | −11.9 (6.5) | 1 | 1 | NS 
27 | −89.6 (4.3) | −11.9 (0.5) | −85.1 (5.9) | −12.7 (1) | 1 | 1 | NS 
28 | NA (NA) | NA (NA) | −56.1 (1.1) | −8.8 (0.1) | NA | NA | NA 
29 | −128.8 (15.2) | −16.5 (2.2) | −108.3 (20.8) | −13.6 (3.6) | 1 | 1 | ** 
30 | −81.1 (22.3) | −10.6 (2.7) | −91.7 (6.5) | −11.5 (0.5) | 1 | 1 | ** 
31 | −38.7 (11) | −6.1 (1.4) | −35 (0.2) | −4.9 (0) | 1 | 1 | NS 
32 | −586.4 (0) | −83.9 (0) | −28.4 (2.5) | −5 (0.3) | | 0 | NA 
33 | −69.9 (34.7) | −9.8 (4.6) | −73.6 (10.9) | −10 (1) | 1 | 1 | NS 
34 | −49.4 (7.2) | −7.6 (0.9) | −34.6 (5.2) | −5.2 (1.3) | 1 | 1 | ** 
35 | −49.9 (10.5) | −8.4 (1.6) | −17 (1.3) | −4.5 (0.3) | 1 | 1 | ** 
36 | −9.7 (1) | −2.2 (0.2) | −26.4 (2.9) | −4.5 (0.2) | 0 | 1 | NA 
37 | −43.7 (9.2) | −6.8 (1.2) | −29.3 (2.8) | −4 (1.1) | 1 | 1 | ** 
38 | −123.5 (36.9) | −17.4 (4.9) | −62.8 (15.8) | −9 (2.1) | 1 | 1 | ** 
39 | −25.2 (0.8) | −3.9 (0.1) | −38.5 (15.9) | −5.7 (2) | 1 | 1 | ** 
40 | −72 (22.7) | −12.2 (3.5) | −21.6 (14) | −4.5 (0.6) | 1 | 1 | ** 
41 | −168.1 (49.6) | −24.2 (6.5) | −14.4 (2.7) | −3 (0.3) | 1 | 1 | ** 
42 | −118.7 (22.5) | −15.7 (2.6) | −48.5 (7.8) | −7.6 (0.8) | 1 | 1 | ** 
43 | −51.1 (16.4) | −7.8 (1.9) | −9.9 (7.5) | −2.6 (0.8) | 1 | 1 | ** 
44 | −359.1 (145.8) | −46 (17.8) | −62.8 (15.8) | −9 (2.1) | 1 | 0 | NA 
45 | −97.2 (20.1) | −12.4 (2.2) | −46.6 (5.5) | −6.7 (0.4) | 1 | 1 | ** 
46 | −159.2 (129.2) | −21.1 (15.9) | −14.4 (2.7) | −3 (0.3) | 1 | 1 | ** 
47 | −50.3 (8.5) | −7.5 (1.1) | −14.4 (2.7) | −3 (0.3) | 1 | 1 | ** 


This table shows the hydrogen and oxygen isotopic compositions of the plant water source (δ²H_intersect and δ¹⁸O_intersect) for each site, presented as median (interquartile range), calculated using 
equations (1) and (2) of the present work. Values of groundwater hydrogen and oxygen isotopic compositions (δ²H_gw and δ¹⁸O_gw) are also presented as median (interquartile range). Columns 6 and 7 
are criteria for the calculated intersection value: criterion 1, δ²H_intersect < max(δ²H_plant); and criterion 2, δ²H_intersect > −200‰. NA, not applicable, that is, when at least one criterion was not met by site 
data, or when data were lacking to perform the statistical analysis. **Denotes statistically significant difference between plant water source and groundwater (GW) hydrogen stable isotopic 
compositions using the non-parametric Wilcoxon rank sum test (α = 0.05). NS, not significant. 
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isotopic measurement and could therefore not be used for regression 
analysis. 

Figure 1a shows one example of the difference between the original 
estimate of Evaristo et al.¹ (green triangle) and our revised estimate 
(orange triangle) of the plant water source (site ID 26). For most of the 
sites, our estimates differ considerably from those of Evaristo et al.¹ (see 
Extended Data Figs 1, 2 and 3). 

Extended Data Figs 1, 2 and 3 also reveal that the data set is 
extremely heterogeneous in terms of the number of sampled points 
for plant and groundwater, with sometimes inconsistent data leading 
to unrealistic values of δ¹⁸O_intersect and δ²H_intersect for certain 
sites. Therefore, we applied two non-exclusive criteria to assess 
the consistency of the calculated intersection values: criterion 1, 
δ²H_intersect < max(δ²H_plant); and criterion 2, δ²H_intersect > −200‰. The 
first criterion implies that m < a, while the second criterion evaluates 
whether the plant water source hydrogen isotopic composition value 
is realistic. Eleven sites failed at least one of the two criteria (IDs 3, 7, 
8, 17, 20, 21, 23, 24, 32, 36 and 44). Figure 1b shows one example of an 
inconsistent data set (site ID 3). 
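The two consistency criteria can be expressed as a short check. The sketch below is illustrative only: the function name is hypothetical, the intersection medians for sites 26, 21 and 17 are those listed in Table 1, and the plant xylem values are invented placeholders, because the individual xylem measurements are not reproduced here.

```python
def check_criteria(d2h_intersect, d2h_plant, lower_bound=-200.0):
    """Return (criterion 1, criterion 2) as booleans for one site."""
    # Criterion 1: the intersection lies below the most enriched xylem sample.
    criterion_1 = d2h_intersect < max(d2h_plant)
    # Criterion 2: the intersection is a realistic value (per mil).
    criterion_2 = d2h_intersect > lower_bound
    return criterion_1, criterion_2


# Intersection medians from Table 1; xylem values are hypothetical placeholders.
sites = {
    26: (-103.8, [-95.0, -90.0, -85.0]),
    21: (-415.2, [-80.0, -70.0, -60.0]),
    17: (11.6, [-40.0, -30.0, -20.0]),
}

results = {
    site_id: check_criteria(d2h_int, plant)
    for site_id, (d2h_int, plant) in sites.items()
}

for site_id, (c1, c2) in results.items():
    status = "consistent" if (c1 and c2) else "inconsistent"
    print(site_id, int(c1), int(c2), status)
```

A site counts as consistent only when both criteria hold, which is why sites failing either test are marked NA in Table 1.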

Results of this analysis (summarized in Table 1) show that at 26 sites, 
where data were consistent, δ²H_gw was statistically different from 
δ²H_intersect using the non-parametric Wilcoxon rank sum test (α = 0.05). 
In conclusion, rainfall segregation (as defined by Evaristo et al.¹) could be 
observed for only 57% of the sites of the authors' data set and at 74% of the 
sites with consistent estimates of the intercept as defined by our two criteria. 
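The two percentages follow directly from the site counts given in the text. The short check below reproduces the arithmetic; the variable names are illustrative, and the 57% figure is taken here to be relative to the 46 sites for which an intersection could be computed, which is an assumption about the denominator.

```python
total_sites = 47
computable_sites = total_sites - 1   # site ID 28 had a single xylem measurement
failed_sites = [3, 7, 8, 17, 20, 21, 23, 24, 32, 36, 44]
consistent_sites = computable_sites - len(failed_sites)
significant_sites = 26               # sites with a significant difference

share_of_computable = 100 * significant_sites / computable_sites
share_of_consistent = 100 * significant_sites / consistent_sites

print(consistent_sites)
print(round(share_of_computable), round(share_of_consistent))
```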


Evaristo et al. reply 


Online Content Methods, along with any additional Extended Data display items and 
Source Data, are available in the online version of the paper; references unique to 
these sections appear only in the online paper. 
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REPLYING TO M. Javaux, Y. Rothfuss, J. Vanderborght, H. Vereecken & N. Bruggemann Nature 536, http://dx.doi.org/10.1038/18946 (2016) 


In the accompanying Comment¹, Javaux et al. correct a mistake in 
equation (2) of our work²; as they point out¹, the mistake does not 
impact the central conclusion of our paper that ecohydrological 
separation is widespread. However, equations (2) and (3) in ref. 2 
calculate the source precipitation value of xylem water as the point 
where the xylem water evaporation line (EL) intersects the local 
meteoric water line (LMWL). Javaux et al.¹ therefore note that the 
mistake affects our finding² that "at 80% of the sites, the precipitation 
that supplies groundwater recharge and streamflow is different from the 
water that supplies parts of soil water recharge and plant transpiration". 
There are two key points in our response. 
(1) We recognize the mistake now noted in equation (2) and thank 
Javaux et al.¹ for this correction. These authors¹ find that rainfall 
segregation could be observed at only 74% of the sites (as defined 
by the two criteria in ref. 1), and not 80% as we originally reported². 
(2) Our work² presented evidence for ecohydrological separation based 
on a meta-analysis of dual liquid water isotope data (δ²H and 
δ¹⁸O) from 47 studies. This conclusion is supported by studies that 
analysed water vapour isotope data from the Tropospheric Emission 
Spectrometer aboard NASA's Aura satellite³ and by global differences 
between annual precipitation and groundwater isotope compositions⁴,⁵. 
These global-in-scale lines of evidence support earlier field evidence⁶,⁷ 
that ecohydrological separation (defined as plants using water of a 
character different to that of mobile water found in soils, groundwater 
and streamflow) is widespread, and is the rule rather than the exception. 
Ecohydrological separation was calculated using equation (1) in ref. 2. 
It must be understood that equation (1) in ref. 2 is independent of the 
source precipitation analysis, which was calculated using equations (2) 
and (3) in that paper. Therefore, any issue with equation (2) in our 
paper, like the one raised by ref. 1, does not affect the ecohydrological 
separation conclusion. 


We hope that this exchange will generate further interest in the use 
of stable O and H isotopes in plant water relation studies. 


J. Evaristo¹, S. Jasechko² & J. J. McDonnell¹,³,⁴ 

¹Global Institute for Water Security and School of Environment and 
Sustainability, University of Saskatchewan, Saskatoon, Saskatchewan S7N 
3H5, Canada. 

email: jaivime.evaristo@usask.ca 

²Department of Geography, University of Calgary, Calgary, Alberta T2N 
1N4, Canada. 
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Extended Data Figure 2 | As Extended Data Fig. 1 but for sites ID 17 to 32. Site ID 26 is in Fig. 1a of the present paper. 
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Extended Data Figure 3 | As Extended Data Fig. 1 but for sites ID 33 to 47. 
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AT MONASH, 
EXCELLENCE 
AND EQUITY 
IS NOT WHAT 
WE DO, IT IS 


At Monash, we are committed to ensuring our staff and 
students reflect the world we are working towards: diverse, 
inclusive, innovative and sustainable. We actively support 
women in senior roles, especially in the fields of science, 
technology, engineering, maths and medicine (STEM). 
Our commitment is demonstrated by our participation in 


the Science in Australia Gender Equity pilot of Athena SWAN. 


monash.edu/jobs 


Above all, we value excellence, no matter where you're from. 
As the largest university in Australia, with an international 
reputation for driving change, you can be assured that we 
have a wealth of resources to support you and your research. 
Join us at Monash University, where great minds come 
together to shape the future. 


MONASH 
University 


NATURE CAREERS GUIDE 


Ningbo University 


Seeking bright minds 


Located in the historical port city 
of Ningbo in eastern China, Ningbo 
University is a burgeoning compre- 
hensive university co-established by 
the Chinese Ministry of Education, 
Zhejiang provincial government and 
Ningbo municipal government. Young 
and dynamic, Ningbo University 
is already ranked among the top 
100 universities in China. 


Ningbo University is actively 
seeking talented researchers to 
strengthen its faculty team. 


Openings for academic leaders 
Requirements: 

* A doctoral degree from an overseas institution is expected, along with 
at least three years of work experience conducting research overseas; for 
those who have obtained their doctoral degree from a domestic institution, 
at least three years of overseas teaching or research experience is a must 

* Experience working as a tenured professor or equivalent in a well-known 
university or research institution overseas (associate professor experience 
is fine for young candidates from top universities or institutions); generally, 
candidates should qualify for the national Thousand Talents Program 

* A proven track record of achievements in a specialized research field, 
with the potential to become an academic or technical leader in the field 

* Ability to work full-time on site, and preferably under 50 years old 
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Openings for top young scientists 
Requirements: 
* A doctoral degree from an overseas institution is preferred, along with 
at least three years of post-graduate research experience overseas; those with 
doctoral degrees from domestic institutions must have at least three years 
of experience conducting research or teaching overseas 

* Experience working full-time in a well-known university or research 
institution overseas, conducting research or teaching; generally, candidates 
should qualify for the national Thousand Young Talents Program or the 
provincial Thousand Talents Program 

* Ability to work full-time on site, and preferably under 45 years old 


Openings for excellent doctoral 
researchers 
Requirements: 
* A doctoral degree from an overseas institution is preferred, along with 
at least three years of work experience conducting research overseas; those 
with doctoral degrees from domestic institutions should have at least three 
years of experience conducting research or teaching overseas 

* A track record of publication experience, with at least one paper 
published in Social Sciences Citation Index or Arts & Humanities Citation Index 
journals for candidates in humanities and social sciences fields; two or more 
papers published in Science Citation Index-listed journals or at least 
one publication in a top journal for candidates in natural sciences fields 

* Ability to work full-time at the university 


Compensation 
Generous compensation packages will be available. For excellent doctoral 
researchers, the successful candidate will receive a settling-in allowance 
of 600,000 (180,000+420,000) RMB. Those with four or more publications in 
top journals are eligible to be hired as associate professors, and will 
receive a settling-in allowance of 800,000 (600,000+200,000) RMB. 


Application procedure 
Please submit a completed application form, a curriculum vitae and a cover 
letter, along with other relevant supporting materials via e-mail to: 
rsc@nbu.edu.cn. 

For additional information regarding the application, such as the number of 
openings, please visit: http://www.nbu.edu.cn/shizi. 




NINGBO UNIVERSITY 


Contact 

E-mail: rsc@nbu.edu.cn 
Tel. no: +86-574-87600288 
Website: www.nbu.edu.cn/english 


NANYANG 


TECHNOLOGICAL 
UNIVERSITY 


THE WORLD'S BEST YOUNG UNIVERSITY IS LOOKING FOR THE WORLD'S MOST 

Here are 3 good reasons to join NTU Singapore, 
THE WORLD'S TOP YOUNG UNIVERSITY 

* Pursue your highest aspirations at the world's fastest-rising young 
university that is also in the top 13 of the global university league 

* Reap results at this research-intensive university that leads the top 
Asian universities in [...] citation [...] and is ranked [...] among 
universities in Nature Index 2016 

* Rise in your chosen field as an elite Nanyang Assistant Professor with 
a start-up research grant of up to US$727,000 (S$1 million), an attractive 
remuneration package and a tenure-track appointment 


Apply for the 2017 Nanyang Assistant Professorship 

If you are an early-career researcher (postdoctoral fellow or equivalent), 
and are ready to lead your research group independently, write to us at 

nanyangprofessorship@ntu.edu.sg or visit www.ntu.edu.sg/nap to apply. 


Application deadline: 9 October 2016 www.ntu.edu.sg 


Interested candidates should contact: 


Rui Wang 

Vice director of the Department of 
Human Resources and Education 
Tel: +86-755-86392095 

E-mail: rui.wang@siat.ac.cn 


Job openings at SIAT 


The Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences 


The Shenzhen Institutes of Advanced 
Technology (SIAT), Chinese Academy of 
Sciences (CAS), was jointly established by 
CAS, the Chinese University of Hong Kong 
and the Shenzhen Municipal Government 
in 2006. SIAT has earned an outstanding 
reputation for translating basic research 
into industrial applications, thanks to 
its interdisciplinary teams of scientists 
and engineers. At the interface between 
biological technologies and information 
technologies, SIAT is well positioned for 
innovation in the twenty-first century. 


SIAT is hiring for the following positions: 

* Researchers for the National Thousand Talents Program 

* Researchers for the CAS Hundred Talents Program 

* Distinguished research fellows 

* Professors, associate professors, and assistant professors 


Research fields include: 

* Advanced biomedical imaging and therapeutics technologies 

* Biomedical devices and medical robotics 

* Neural engineering and rehabilitation 

* Biomaterials and tissue engineering 

* Biomedical informatics and genome sequencing 

* Micro-, nano- and bio-engineering 

* Nanomedicine, antibodies and immunology 

* Robotics and intelligent systems 

* High-performance and cloud computing 

* Information and communications technologies 

* Data mining 

* Precision engineering 

* Water-treatment technologies 

* Neurobiological mechanisms of brain disorders 
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Southwest Jiaotong University 


Invitation to apply for academic positions 


Located in Chengdu, the capital of China’s Sichuan province, 
Southwest Jiaotong University (SWJTU) is an elite comprehensive 
university founded in 1896. Excelling in engineering, SWJTU also 
covers the sciences and arts with its 19 faculties, institutes and 
centres. Its 2,600 faculty and staff offer complete undergraduate, 
graduate and postdoctoral programmes. 


SWJTU cordially invites researchers to apply for positions across all 
its disciplines. 


Positions and requirements 

High-level Talented Leaders 

Applicants should be qualified for national top talents programmes 
such as the Program of Global Experts, Cheung Kong Scholars and 
National Science Fund for Distinguished Young Scholars. 


Young Leading Scholars 

Applicants should preferably be qualified for the following programmes: 

* National Thousand Young Talents Program 

* Top Young Talents of National Special Support Program 

* National Natural Science Foundation of China Excellent Young 
Scholars Program 

Applicants should also have a strong team spirit and leadership 
qualities, outstanding academic achievements, a broad academic 
vision and experience with international cooperation along with the 
potential to become leading academic researchers. 


Excellent Young Academic Backbones 
Applicants should preferably be under 40 years old with degrees 
from first-class universities or institutes in China or overseas. 


Excellent Postdoctoral Fellows 
Applicants should preferably be under 35 years old and have the 
potential to become excellent academic researchers. 


Compensation 

Salaries will be highly competitive, commensurate with 
qualifications and experience. Other benefits include settling-in 
allowances, housing subsidies, child education support, start-up 
funds, assistance in establishing scientific platforms and 
research groups, along with international training and promotion 
opportunities. Detailed packages are negotiable. 


Application procedure 

Please send a curriculum vitae, copies of academic credentials, 
a publication list with abstracts of selected published papers, a 
research plan, a teaching statement and the contact details of 
three referees to the Human Resources Department. 


For inquiries, please contact Mr. Yu Wang, Ms. Ye Zeng or Ms. Qingya Wang 


Telephone: +86-28-66366202 
E-mail: talent@swjtu.edu.cn 
Address: Human Resources Department, SWJTU, Western Park of High-Tech 
Zone, Chengdu, Sichuan, China, 611756 


Website: www.swjtu.edu.cn 
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Located in the historical and picturesque city of Hangzhou, Zhejiang 
University is a prestigious higher education institution in China. With a 
long history that can be traced back to the Qiushi Academy founded in 
1897, the university enjoys a good reputation at home and abroad. After 
more than 100 years of development, it has become a comprehensive 
research university with unique characteristics. It has expanded to include 
7 campuses and 36 colleges and schools, and its research and training 
programmes span 12 academic disciplines. Of the 22 disciplines listed in 
Thomson Reuters’ Essential Science Indicators (ESI) database, Zhejiang 
University has 18 ranked in the top 1% in the world. 

Zhejiang University has always stayed true to its commitment to 
cultivating talent with excellence, advancing science and technology, 
serving the well-being of society, and promoting advanced culture. This 
spirit is best captured by the university’s motto of “seeking the truth and 
pioneering new trails”. 

Following the educational philosophy of putting people first, cultivating 
all-round competence in students, seeking the truth and pioneering new 
trails in search of excellence, Zhejiang University is committed to 
cultivating future leaders who have international outlooks. It is also 
leading the country in educational reforms. A large number of outstanding 
figures in modern Chinese history have studied or conducted research at the 
university. More than 160 of its alumni have gone on to become members of 
the Chinese Academy of Sciences or the Chinese Academy of Engineering. With 
a rich campus culture, advanced teaching facilities and a wide range of 
international exchange opportunities, the university is keen to create 
conditions to facilitate students’ development. 

Aiming to become a world-class university, Zhejiang University also focuses 
on research innovation. It has launched a number of international academic 
platforms and has attracted many talented researchers in various 
disciplines. To further strengthen its research teams, the university has 
recently launched the Hundred Talents Program, which seeks to encourage 
bright minds to join Zhejiang University from home and abroad. 
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MULTIPLE OPENINGS FOR THE HUNDRED TALENTS PROGRAM 


The Hundred Talents Program is a highly competitive, well-funded programme 
that covers all colleges and departments at Zhejiang University. Successful 
candidates are expected to establish internationally competitive and 
independent research programmes, focusing on cutting-edge science in their 
respective fields. They are required to work full-time and deliver excellent 
research and teaching at Zhejiang University, following the spirit of 
“seeking the truth and pioneering new trails”. 

Required qualifications 
Candidates should have: 
• A demonstrated commitment to excellence in teaching and research; 
• A proven track record of academic achievement, comparable to those of 
  assistant professors or associate professors in world-renowned 
  universities; and 
• A doctoral degree from a world-renowned university, preferably with 
  postdoctoral experience. Young and mid-career researchers are 
  particularly encouraged to apply. 

What we offer 
The successful candidate will be employed as a Zhejiang University 100 
Professor, who is qualified to supervise doctoral students. The university 
will offer: 
• A generous compensation package; 
• Subsidized housing; 
• Internationally competitive startup packages for research; and 
• Large office and laboratory space with state-of-the-art facilities. 

How to apply 
Qualified applicants are strongly encouraged to submit their applications 
electronically to tr@zju.edu.cn with the following materials in PDF format: 
• A comprehensive CV; 
• A statement of research and a teaching plan; 
• Certificates of academic degrees; 
• Three writing samples of major publications; 
• A list of three to five references with detailed contact information. 


OTHER ACADEMIC OPENINGS 


In addition to the Hundred Talents Program, there are a number of talent 
recruitment programmes and open faculty positions for talented scholars 
worldwide, listed below. 

Full-time positions: 
• Zhejiang University Distinguished Professor of Humanities and Social 
  Science 
• The 1000 Talent Plan: Long-term Recruitment Program of Foreign Experts, 
  Long-term Innovative Talents and Young Professionals recruitment 
  programmes 
• Cheung Kong Scholar Program: Special-term Professor & Young Scholar 
  programmes 
• Zhejiang Provincial Program for High-level Overseas Talents 
• Zhejiang University Qiushi Distinguished Scholar 
• Zhejiang University Leading Professor of Humanities and Social Science 

Part-time positions: 
• The 1000 Talent Plan: Short-term Recruitment Program of Foreign Experts 
  and Short-term Innovative Talent programme 
• High-end Foreign Experts Recruitment Program 
• Zhejiang Provincial Program for High-level Overseas Talents (Short-term 
  Innovative Talent) 
• Zhejiang University Chair Professors 


For detailed application information on our talent recruitment programmes, please 

Zhejiang University sincerely invites highly talented scientists and scholars 
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